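This hands-on project implements a simple grep-like program: it finds the lines of a file that contain a target string and prints them.
The file being searched has the following content: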
text.txt
I'm nobody! Who are you?
Are you nobody, too?
Then there's a pair of us - don't tell!
They'd banish us, you know!
How dreary to be somebody!
How public, like a frog
To tell your name the livelong day
To an admiring bog!
1 Reading command-line arguments
use std::env;
fn main() {
let args: Vec<String> = env::args().collect(); // collects the arguments into a Vec; env::args() panics if any argument is not valid Unicode
let query = &args[1];
let filename = &args[2];
println!("Search for {}", query);
println!("In file {}", filename);
}
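The comment above notes that env::args() cannot cope with arguments that are not valid Unicode. As a minimal sketch of one way around this, assuming a lossy conversion is acceptable, the standard library's env::args_os() can be used instead. The helper name lossy_args is hypothetical and is not part of the original program.

```rust
use std::ffi::OsString;

/// Hypothetical helper (not part of the original program): converts raw OS
/// arguments into `String`s, replacing invalid Unicode with U+FFFD instead
/// of panicking.
fn lossy_args<I>(raw: I) -> Vec<String>
where
    I: IntoIterator<Item = OsString>,
{
    raw.into_iter()
        .map(|arg| arg.to_string_lossy().into_owned())
        .collect()
}
```

In main, this could be called as lossy_args(std::env::args_os()), since env::args_os() yields OsString values and does not panic on invalid Unicode.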
2 Reading the file contents
use std::env;
use std::fs;
fn main() {
let args: Vec<String> = env::args().collect(); // collects the arguments into a Vec; env::args() panics if any argument is not valid Unicode
let query = &args[1];
let filename = &args[2];
println!("Search for {}", query);
println!("In file {}", filename);
let contents = fs::read_to_string(filename)
.expect("Something went wrong when opening file!");
println!("With text:\n{}", contents);
}
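This program appears to do the job, but it has several flaws:
- The main function does too many things. It is better to split the code into several functions, each responsible for one task, so the code stays maintainable as it grows.
- Errors from opening the file are not handled clearly or flexibly.
- The query and filename values are related and could be grouped in one struct.
- The error message printed when the program panics is hard for users to understand.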
3 Refactoring the program
3.1 Improving modularity
Extracting a function:
use std::env;
use std::fs;
fn main() {
let args: Vec<String> = env::args().collect();
let (query, filename) = parse_config(&args);
println!("Search for {}", query);
println!("In file {}", filename);
let contents = fs::read_to_string(filename)
.expect("Something went wrong when opening file!");
println!("With text:\n{}", contents);
}
fn parse_config(args: &[String]) -> (&str, &str) {
let query = &args[1];
let filename = &args[2];
(query, filename)
}
Using a struct and a constructor:
use std::env;
use std::fs;
struct Config {
query: String,
filename: String,
}
impl Config {
fn new(args: &[String]) -> Config {
let query = args[1].clone();
let filename = args[2].clone();
Config { query, filename }
}
}
fn main() {
let args: Vec<String> = env::args().collect(); // collects the arguments into a Vec; env::args() panics if any argument is not valid Unicode
let config = Config::new(&args);
println!("Search for {}", config.query);
println!("In file {}", config.filename);
let contents =
fs::read_to_string(config.filename).expect("Something went wrong when opening file!");
println!("With text:\n{}", contents);
}
3.2 Error handling
Handling the error of too few arguments:
impl Config {
fn new(args: &[String]) -> Config {
if args.len() < 3 {
panic!("Not enough arguments");
}
let query = args[1].clone();
let filename = args[2].clone();
Config { query, filename }
}
}
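The program above does print the error message, but the panic output still carries a lot of noise that is useless to the user (the thread name, the source location, the backtrace hint). Returning a Result from new lets main print a short message and exit: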
use std::{env, process, fs};
struct Config {
query: String,
filename: String,
}
impl Config {
fn new(args: &[String]) -> Result<Config, &'static str> {
if args.len() < 3 {
return Err("Not enough arguments!");
}
let query = args[1].clone();
let filename = args[2].clone();
Ok(Config { query, filename })
}
}
fn main() {
let args: Vec<String> = env::args().collect(); // collects the arguments into a Vec; env::args() panics if any argument is not valid Unicode
let config = Config::new(&args).unwrap_or_else(
|err| {
println!("Problem parsing arguments: {}", err);
process::exit(1);
}
);
println!("Search for {}", config.query);
println!("In file {}", config.filename);
let contents =
fs::read_to_string(config.filename).expect("Something went wrong when opening file!");
println!("With text:\n{}", contents);
}
Handling errors from the application logic:
fn main() {
let args: Vec<String> = env::args().collect(); // collects the arguments into a Vec; env::args() panics if any argument is not valid Unicode
let config = Config::new(&args).unwrap_or_else(|err| {
println!("Problem parsing arguments: {}", err);
process::exit(1);
});
println!("Search for {}", config.query);
println!("In file {}", config.filename);
if let Err(e) = my_grep::run(config) {
println!("Application error: {}", e);
process::exit(1);
}
}
3.3 Test-driven development (TDD)
lib.rs
use std::error::Error;
use std::fs;
pub struct Config {
pub query: String,
pub filename: String,
}
impl Config {
pub fn new(args: &[String]) -> Result<Config, &'static str> {
if args.len() < 3 {
return Err("Not enough arguments!");
}
let query = args[1].clone();
let filename = args[2].clone();
Ok(Config { query, filename })
}
}
pub fn run(config: Config) -> Result<(), Box<dyn Error>> {
// Box<dyn Error>: the returned error can be any type that implements the Error trait
let contents = fs::read_to_string(config.filename)?;
for line in search(&config.query, &contents) {
println!("{}", line);
}
Ok(())
}
pub fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
let mut result = Vec::new();
for line in contents.lines() {
if line.contains(query) {
result.push(line);
}
}
result
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn one_result() {
let query = "duct";
let content = "\
Rust:
safe, fast, productive.
Pick three.";
assert_eq!(vec!["safe, fast, productive."], search(query, content));
}
}
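Under TDD, further tests would normally be written before extending the code. As a sketch of two more cases one might add, here are tests for a query with no match and for a query that matches several lines. The search function is repeated so the snippet stands on its own, and the module and test names are hypothetical.

```rust
/// Same logic as the search function in lib.rs, repeated here so the
/// snippet compiles on its own.
pub fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
    let mut result = Vec::new();
    for line in contents.lines() {
        if line.contains(query) {
            result.push(line);
        }
    }
    result
}

#[cfg(test)]
mod more_tests {
    use super::*;

    // Illustrative input, same text as the one_result test.
    const CONTENT: &str = "Rust:\nsafe, fast, productive.\nPick three.";

    #[test]
    fn no_match_returns_empty() {
        assert!(search("monomorphization", CONTENT).is_empty());
    }

    #[test]
    fn multiple_matches_keep_file_order() {
        // "st" occurs in "Rust:" and in "fast", but not in "Pick three."
        assert_eq!(
            vec!["Rust:", "safe, fast, productive."],
            search("st", CONTENT)
        );
    }
}
```

These tests can be run with cargo test.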
Run this command in the terminal: cargo run body text.txt
4 Using environment variables
Using an environment variable to enable case-insensitive search:
use std::error::Error;
use std::{env, fs};
pub struct Config {
pub query: String,
pub filename: String,
pub case_sensitive: bool,
}
impl Config {
pub fn new(args: &[String]) -> Result<Config, &'static str> {
if args.len() < 3 {
return Err("Not enough arguments!");
}
let query = args[1].clone();
let filename = args[2].clone();
let case_sensitive = env::var("CASE_INSENSITIVE").is_err(); // only whether the variable is set matters, not its value
Ok(Config { query, filename, case_sensitive })
}
}
pub fn run(config: Config) -> Result<(), Box<dyn Error>> {
// Box<dyn Error>: the returned error can be any type that implements the Error trait
let contents = fs::read_to_string(config.filename)?;
let results = if config.case_sensitive {
search(&config.query, &contents)
} else {
search_case_insensitive(&config.query, &contents)
};
for line in results {
println!("{}", line);
}
Ok(())
}
pub fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
let mut result = Vec::new();
for line in contents.lines() {
if line.contains(query) {
result.push(line);
}
}
result
}
pub fn search_case_insensitive<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
let mut result = Vec::new();
let query = query.to_lowercase(); // to_lowercase() allocates a new String; the borrowed query is left untouched
for line in contents.lines() {
if line.to_lowercase().contains(&query) {
result.push(line);
}
}
result
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn case_sensitive() {
let query = "duct";
let content = "\
Rust:
safe, fast, productive.
Duct three.";
assert_eq!(vec!["safe, fast, productive."], search(query, content));
}
#[test]
fn case_insensitive() {
let query = "duct";
let content = "\
Rust:
safe, fast, productive.
Duct three.";
assert_eq!(
vec!["safe, fast, productive.", "Duct three."],
search_case_insensitive(query, content)
);
}
}
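Because the code above only checks whether CASE_INSENSITIVE is set, a value such as CASE_INSENSITIVE=0 also switches on case-insensitive search. If that is not wanted, the value itself can be inspected. The following is only a sketch under that assumption: the helper names and the set of values treated as "off" are hypothetical choices, not part of the original program.

```rust
use std::env;

/// Hypothetical helper: decides case sensitivity from the variable's value.
/// `None` means the variable is not set.
fn is_case_sensitive(value: Option<&str>) -> bool {
    match value {
        // Not set: keep the default, case-sensitive search.
        None => true,
        // Set to an explicit "off" value (assumed set): stay case-sensitive.
        Some(v) if matches!(v.trim(), "" | "0" | "false") => true,
        // Any other value switches to case-insensitive search.
        Some(_) => false,
    }
}

/// Reads CASE_INSENSITIVE from the environment and applies the rule above.
fn case_sensitive_from_env() -> bool {
    is_case_sensitive(env::var("CASE_INSENSITIVE").ok().as_deref())
}
```

Keeping the decision in a function that takes an Option<&str> means it can be tested without touching the real process environment.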
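I use macOS, so I set a temporary environment variable by putting env CASE_INSENSITIVE=1 in front of the command. The variable applies only to that one command and is gone afterwards. To keep it for the rest of the current terminal session, export the variable and chain the run command after it with && (which runs the next command only if the previous one succeeded).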
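5 Redirecting standard output and using standard error
Use the eprintln! macro to write error messages to standard error, which still shows up in the terminal, and use > to redirect standard output (whatever println! prints) to a file: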
fn main() {
let args: Vec<String> = env::args().collect(); // collects the arguments into a Vec; env::args() panics if any argument is not valid Unicode
let config = Config::new(&args).unwrap_or_else(|err| {
eprintln!("Problem parsing arguments: {}", err);
process::exit(1);
});
if let Err(e) = my_grep::run(config) {
eprintln!("Application error: {}", e);
process::exit(1);
}
}
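To make the split between the two streams easier to see and to test, the output code can take its writers as parameters. This is a sketch only: the report function and the "No matching lines found" message are hypothetical and do not appear in the original program.

```rust
use std::io::{self, Write};

/// Hypothetical helper: writes matching lines to `out` and, if there are
/// none, a diagnostic message to `err`.
fn report<O: Write, E: Write>(matches: &[&str], mut out: O, mut err: E) -> io::Result<()> {
    for line in matches {
        // Regular results go to the "standard output" writer.
        writeln!(out, "{}", line)?;
    }
    if matches.is_empty() {
        // Diagnostics go to the "standard error" writer.
        writeln!(err, "No matching lines found")?;
    }
    Ok(())
}
```

In the real program the call would be report(&results, io::stdout(), io::stderr()). With a redirect such as > output.txt, only what was written to the first writer ends up in the file.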
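6 Optimizing with iterators and closures
6.1 Using an iterator to get the command-line arguments
env::args() actually returns a std::env::Args value, a type that implements the Iterator trait, so it is itself an iterator. The new function can therefore take this iterator as its parameter and consume it to take ownership of the values it yields, which avoids the clone() calls.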
let config = Config::new(env::args()).unwrap_or_else(|err| {
eprintln!("Problem parsing arguments: {}", err);
process::exit(1);
});
impl Config {
pub fn new(mut args: env::Args) -> Result<Config, &'static str> {
if args.len() < 3 {
return Err("Not enough arguments!");
}
args.next();
let query = match args.next() {
Some(v) => v,
None => {
return Err("Can't get query string!");
}
};
let filename = match args.next() {
Some(v) => v,
None => {
return Err("Can't get file name!");
}
};
let case_sensitive = env::var("CASE_INSENSITIVE").is_err(); // only whether the variable is set matters, not its value
Ok(Config {
query,
filename,
case_sensitive,
})
}
}
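One drawback of taking env::Args directly is that a test cannot construct such a value. As a sketch of an alternative, the constructor can accept any iterator of String values, and Option::ok_or together with the ? operator can replace the match blocks. The name from_args is hypothetical, and the case_sensitive field is left out to keep the example short.

```rust
pub struct Config {
    pub query: String,
    pub filename: String,
}

impl Config {
    /// Sketch: accepts any iterator of `String`s, not only `env::Args`,
    /// so tests can pass in a plain `Vec<String>` iterator.
    pub fn from_args(mut args: impl Iterator<Item = String>) -> Result<Config, &'static str> {
        args.next(); // skip the program name

        // ok_or turns None into Err(..); `?` returns that error early.
        let query = args.next().ok_or("Can't get query string!")?;
        let filename = args.next().ok_or("Can't get file name!")?;

        Ok(Config { query, filename })
    }
}
```

main would still call Config::from_args(env::args()), because env::Args is an iterator over String.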
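6.2 Using an iterator, filter, and a closure to collect the matches
In the search functions, contents.lines() also returns an iterator. Its filter() method takes a closure and returns a new iterator that yields only the items for which the closure returns true, and collect() turns that iterator into a collection.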
pub fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
// let mut result = Vec::new();
//
// for line in contents.lines() {
// if line.contains(query) {
// result.push(line);
// }
// }
//
// result
contents
.lines()
.filter(|line| line.contains(query))
.collect()
}
pub fn search_case_insensitive<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
// let mut result = Vec::new();
// let query = query.to_lowercase(); // to_lowercase() allocates a new String; the borrowed query is left untouched
//
// for line in contents.lines() {
// if line.to_lowercase().contains(&query) {
// result.push(line);
// }
// }
//
// result
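let query = query.to_lowercase(); // lowercase the query once, not once per line
contents
.lines()
.filter(|line| line.to_lowercase().contains(&query)) // contains() accepts a string slice; &String works here too
.collect()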
}
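Since filter() is lazy, the final collect() is only needed because the function returns a Vec. As a sketch of a variation, the function could return the iterator itself and leave it to the caller to decide how many matches to consume. The name search_lazy is hypothetical. Note that query now needs the same lifetime as contents, because the returned iterator keeps borrowing it.

```rust
/// Hypothetical lazy variant of search: yields matching lines on demand
/// instead of collecting them into a Vec.
pub fn search_lazy<'a>(query: &'a str, contents: &'a str) -> impl Iterator<Item = &'a str> + 'a {
    // `move` copies the `query` reference into the closure so the returned
    // iterator does not borrow a local variable.
    contents.lines().filter(move |line| line.contains(query))
}
```

A caller that only wants the first match can write search_lazy(query, contents).next(), which stops scanning as soon as one matching line is found.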
Source code of the optimized program:
main.rs
use std::{env, process};
use my_grep::Config;
fn main() {
let config = Config::new(env::args()).unwrap_or_else(|err| {
eprintln!("Problem parsing arguments: {}", err);
process::exit(1);
});
if let Err(e) = my_grep::run(config) {
eprintln!("Application error: {}", e);
process::exit(1);
}
}
lib.rs
use std::error::Error;
use std::{env, fs};
pub struct Config {
pub query: String,
pub filename: String,
pub case_sensitive: bool,
}
impl Config {
pub fn new(mut args: env::Args) -> Result<Config, &'static str> {
if args.len() < 3 {
return Err("Not enough arguments!");
}
args.next();
let query = match args.next() {
Some(v) => v,
None => {
return Err("Can't get query string!");
}
};
let filename = match args.next() {
Some(v) => v,
None => {
return Err("Can't get file name!");
}
};
let case_sensitive = env::var("CASE_INSENSITIVE").is_err(); // only whether the variable is set matters, not its value
Ok(Config {
query,
filename,
case_sensitive,
})
}
}
pub fn run(config: Config) -> Result<(), Box<dyn Error>> {
// Box<dyn Error>: the returned error can be any type that implements the Error trait
let contents = fs::read_to_string(config.filename)?;
let results = if config.case_sensitive {
search(&config.query, &contents)
} else {
search_case_insensitive(&config.query, &contents)
};
for line in results {
println!("{}", line);
}
Ok(())
}
pub fn search<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
contents
.lines()
.filter(|line| line.contains(query))
.collect()
}
pub fn search_case_insensitive<'a>(query: &str, contents: &'a str) -> Vec<&'a str> {
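let query = query.to_lowercase(); // lowercase the query once, not once per line
contents
.lines()
.filter(|line| line.to_lowercase().contains(&query)) // contains() accepts a string slice; &String works here too
.collect()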
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn case_sensitive() {
let query = "duct";
let content = "\
Rust:
safe, fast, productive.
Duct three.";
assert_eq!(vec!["safe, fast, productive."], search(query, content));
}
#[test]
fn case_insensitive() {
let query = "duct";
let content = "\
Rust:
safe, fast, productive.
Duct three.";
assert_eq!(
vec!["safe, fast, productive.", "Duct three."],
search_case_insensitive(query, content)
);
}
}