What does "Stream did not contain valid UTF-8" mean?

Problem description

I'm creating a simple HTTP server. I need to read the requested image and send it to the browser. I'm using this code:

use std::fs::File;
use std::io::Read;
use std::path::Path;

fn read_file(mut file_name: String) -> String {
    file_name = file_name.replace("/", "");
    if file_name.is_empty() {
        file_name = String::from("index.html");
    }

    let path = Path::new(&file_name);
    if !path.exists() {
        return String::from("Not Found!");
    }
    let mut file_content = String::new();
    let mut file = File::open(&file_name).expect("Unable to open file");
    // This is the call that fails when the file is not valid UTF-8.
    let res = match file.read_to_string(&mut file_content) {
        Ok(content) => content,
        Err(why) => panic!("{}",why),
    };

    return file_content;
}

This works if the requested file is text-based, but when I want to read an image I get the following message:

stream did not contain valid UTF-8

What does it mean, and how do I fix it?

Solution

The documentation for String describes it as:

A UTF-8 encoded, growable string.

The Wikipedia definition of UTF-8 will give you a great deal of background on what that is. The short version is that computers use a unit called a byte to represent data. Unfortunately, these blobs of data represented with bytes have no intrinsic meaning; that has to be provided from outside. UTF-8 is one way of interpreting a sequence of bytes, as are file formats like JPEG.

UTF-8, like most text encodings, has specific rules about which sequences of bytes are valid and which are not. Whatever image you have tried to load contains a sequence of bytes that cannot be interpreted as a UTF-8 string; this is what the error message is telling you.
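
The failure can be reproduced without touching the disk. The following is a minimal sketch, not part of the original code: `read_as_text` is a hypothetical helper that runs `read_to_string` over an in-memory `Cursor`, which goes through the same `Read` trait method as a `File`. The byte `0xFF` never occurs in valid UTF-8, and JPEG files begin with `FF D8 FF`, so an image is rejected at its very first byte.

```rust
use std::io::{self, Cursor, Read};

/// Hypothetical helper: read an in-memory byte stream as UTF-8 text,
/// using the same `Read::read_to_string` call that the question uses on a file.
fn read_as_text(bytes: &[u8]) -> io::Result<String> {
    let mut reader = Cursor::new(bytes);
    let mut text = String::new();
    // Returns an error if the bytes are not valid UTF-8.
    reader.read_to_string(&mut text)?;
    Ok(text)
}
```

Calling `read_as_text("héllo".as_bytes())` yields `Ok`, while `read_as_text(&[0xFF, 0xD8, 0xFF, 0xE0])` yields an `Err` whose `kind()` is `std::io::ErrorKind::InvalidData`.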


To fix it, you should not use a String to hold arbitrary collections of bytes. In Rust, that's better represented by a Vec<u8>:

use std::fs::File;
use std::io::Read;
use std::path::Path;

fn read_file(mut file_name: String) -> Vec<u8> {
    file_name = file_name.replace("/", "");
    if file_name.is_empty() {
        file_name = String::from("index.html");
    }

    let path = Path::new(&file_name);
    if !path.exists() {
        // A String converts into its underlying bytes.
        return String::from("Not Found!").into();
    }
    let mut file_content = Vec::new();
    let mut file = File::open(&file_name).expect("Unable to open file");
    // read_to_end copies the raw bytes without interpreting them as text.
    file.read_to_end(&mut file_content).expect("Unable to read");
    file_content
}
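
As a possible variation, the standard library's `std::fs::read` opens a file and reads it to the end in one call. The sketch below assumes you want a status code alongside the body instead of a panic; `load_body` and its `(u16, Vec<u8>)` return shape are hypothetical and not part of the original server.

```rust
use std::fs;
use std::io::ErrorKind;
use std::path::Path;

/// Hypothetical helper: return an HTTP status code and the raw body bytes.
fn load_body(path: impl AsRef<Path>) -> (u16, Vec<u8>) {
    match fs::read(path) {
        // Raw bytes, so images and text are handled the same way.
        Ok(bytes) => (200, bytes),
        // A missing file becomes a 404 instead of a panic.
        Err(err) if err.kind() == ErrorKind::NotFound => (404, b"Not Found!".to_vec()),
        // Any other I/O failure (permissions, etc.) becomes a 500.
        Err(_) => (500, b"Internal Server Error".to_vec()),
    }
}
```

Matching on the error from `fs::read` also removes the separate `path.exists()` check, so the file cannot disappear between the check and the open.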


To evangelize a bit, this is a big part of why Rust is a nice language. Because there is a type that represents "a set of bytes that is guaranteed to be a valid UTF-8 string", we can write safer programs since we know that this invariant will always be true. We don't have to keep checking throughout our program to "make sure" it's still a string.
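
To illustrate the invariant, the validity check happens once, at the point where bytes are converted into a `String`. A small sketch with two hypothetical helpers: `body_as_text` uses `String::from_utf8`, which validates and hands the bytes back on failure, and `body_for_log` uses `String::from_utf8_lossy`, which replaces invalid sequences with U+FFFD.

```rust
/// Hypothetical helper: treat a body as text only if it really is UTF-8.
/// On failure the original bytes are returned unchanged.
fn body_as_text(body: Vec<u8>) -> Result<String, Vec<u8>> {
    String::from_utf8(body).map_err(|err| err.into_bytes())
}

/// Hypothetical helper: always produce a String, replacing invalid
/// sequences with the replacement character U+FFFD.
fn body_for_log(body: &[u8]) -> String {
    String::from_utf8_lossy(body).into_owned()
}
```

Once `body_as_text` returns `Ok`, the rest of the program can rely on the `String` being valid UTF-8 without checking again.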
