Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
31 changes: 30 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
@@ -1,14 +1,43 @@
# rust-cfb

[![Build Status](https://github.com/mdsteele/rust-cfb/actions/workflows/tests.yml/badge.svg)](https://github.com/mdsteele/rust-cfb/actions/workflows/tests.yml)
[![Crates.io](https://img.shields.io/crates/v/cfb.svg)](https://crates.io/crates/cfb)
[![Documentation](https://docs.rs/cfb/badge.svg)](https://docs.rs/cfb)

A Rust library for reading/writing [Compound File Binary](
https://en.wikipedia.org/wiki/Compound_File_Binary_Format) (structured storage)
files. See [MS-CFB](https://msdn.microsoft.com/en-us/library/dd942138.aspx)
files. See [MS-CFB](https://msdn.microsoft.com/en-us/library/dd942138.aspx)
for the format specification.

## License

rust-cfb is made available under the
[MIT License](http://spdx.org/licenses/MIT.html).

## Development

This project uses [Cargo](https://doc.rust-lang.org/cargo/) as its build system
and package manager.

### Tests

```bash
cargo test
```

### Benchmarks

There is a benchmark suite using [Criterion.rs](https://github.com/bheisler/criterion.rs). To run the benchmarks:

```bash
cargo bench
```

For a clean run (reset Criterion stats), delete the Criterion output directory and re-run:

```bash
rm -rf target/criterion
cargo bench
```


41 changes: 31 additions & 10 deletions src/internal/path.rs
Original file line number Diff line number Diff line change
Expand Up @@ -43,17 +43,38 @@ fn cfb_uppercase_char(c: char) -> char {
/// order](https://en.wikipedia.org/wiki/Shortlex_order), rather than
/// dictionary order).
pub fn compare_names(name1: &str, name2: &str) -> Ordering {
match name1.encode_utf16().count().cmp(&name2.encode_utf16().count()) {
// This is actually not 100% correct -- the MS-CFB spec specifies a
// particular way of doing the uppercasing on individual UTF-16 code
// units, along with a list of weird exceptions and corner cases. But
// hopefully this is good enough for 99+% of the time.
Ordering::Equal => {
let n1 = name1.chars().map(cfb_uppercase_char);
let n2 = name2.chars().map(cfb_uppercase_char);
n1.cmp(n2)
// This ASCII fast-path is important for performance.
// We saw a 10x speedup for many small streams when comparing pure ascii names.
// Make sure you run the write benchmark before and after changing this code
// to not introduce regressions.
if name1.is_ascii() && name2.is_ascii() {
match name1.len().cmp(&name2.len()) {
Ordering::Equal => {
for (left, right) in name1.bytes().zip(name2.bytes()) {
let left = left.to_ascii_uppercase();
let right = right.to_ascii_uppercase();
match left.cmp(&right) {
Ordering::Equal => {}
other => return other,
}
}
Ordering::Equal
}
other => other,
}
} else {
match name1.encode_utf16().count().cmp(&name2.encode_utf16().count()) {
// This is actually not 100% correct -- the MS-CFB spec specifies a
// particular way of doing the uppercasing on individual UTF-16 code
// units, along with a list of weird exceptions and corner cases. But
// hopefully this is good enough for 99+% of the time.
Ordering::Equal => {
let n1 = name1.chars().map(cfb_uppercase_char);
let n2 = name2.chars().map(cfb_uppercase_char);
n1.cmp(n2)
}
other => other,
}
other => other,
}
}

Expand Down