Determine the end-of-line format, tabs, bom, and nul characters
- For help, run `chars -h`

chars v2.6.0
Determine the end-of-line format, tabs, bom, and nul
https://github.com/jftuga/chars
Usage:
chars [filename or file-glob 1] [filename or file-glob 2] ...
-F when used with -f, only display a list of failed files, one per line
-b examine binary files
-c add comma thousands separator to numeric values
-e string
exclude based on regular expression; use .* instead of *
-f string
fail with OS exit code=100 if any of the included characters exist; ex: -f crlf,nul,bom8,nonascii
-j output results in JSON format; can't be used with -l; does not honor -t or -c
-l int
shorten files names to a maximum of this length
-s string
sort output by column: filename, crlf, lf, tab, nul, bom8, bom16, nonascii, bytesread (default "filename")
-t append a row which includes a total for each column
-v display version and then exit
Notes:
Use - to read a file from STDIN
On Windows, try: chars * -or- chars */* -or- chars */*/*
- macOS: `brew update; brew install jftuga/tap/chars`
- Binaries for Linux, macOS and Windows are provided in the releases section.
- Run `chars` with no additional cmd-line switches
  - Only report files in the current directory
  - Report text files only since `-b` is not used

PS C:\chars> .\chars.exe *
+-----------------+------+-----+-----+------+------+-------+-----------+-----------+
| FILENAME | CRLF | LF | TAB | NUL | BOM8 | BOM16 | NON-ASCII | BYTESREAD |
+-----------------+------+-----+-----+------+------+-------+-----------+-----------+
| .goreleaser.yml | 0 | 59 | 0 | 0 | 0 | 0 | 0 | 1066 |
| LICENSE | 0 | 21 | 0 | 0 | 0 | 0 | 0 | 1068 |
| README.md | 0 | 92 | 0 | 0 | 0 | 0 | 0 | 3510 |
| chars.go | 0 | 246 | 328 | 0 | 0 | 0 | 0 | 6477 |
| go.mod | 0 | 10 | 2 | 0 | 0 | 0 | 0 | 188 |
| go.sum | 0 | 6 | 0 | 0 | 0 | 0 | 0 | 533 |
| testfile1 | 0 | 22 | 0 | 3223 | 0 | 1 | 27 | 6448 |
+-----------------+------+-----+-----+------+------+-------+-----------+-----------+
- Run `chars` with `-e` and `-l` cmd-line switches
  - Only report files starting with `p` in the `C:\Windows\System32` directory
  - Exclude all files matching `perf.*dat`
  - Shorten filenames to a maximum length of `32`

PS C:\chars> .\chars.exe -e perf.*dat -l 32 C:\Windows\System32\p*
+----------------------------------+------+----+-----+------+------+-------+-----------+-----------+
| FILENAME | CRLF | LF | TAB | NUL | BOM8 | BOM16 | NON-ASCII | BYTESREAD |
+----------------------------------+------+----+-----+------+------+-------+-----------+-----------+
| C:\Windows\System32\pcl.sep | 11 | 0 | 0 | 0 | 0 | 0 | 0 | 150 |
| C:\Windows\System32\perfmon.msc | 1933 | 0 | 0 | 0 | 0 | 0 | 0 | 145519 |
| C:\Windows\Sys...tmanagement.msc | 1945 | 0 | 0 | 0 | 0 | 0 | 0 | 146389 |
| C:\Windows\System32\pscript.sep | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 51 |
| C:\Windows\Sys...eryprovider.mof | 0 | 61 | 0 | 2073 | 0 | 1 | 987 | 4148 |
+----------------------------------+------+----+-----+------+------+-------+-----------+-----------+
- Pipe STDIN to `chars`
- Use JSON output, with `-j`

$ curl -s https://example.com/ | chars -j
[
{
"filename": "STDIN",
"crlf": 0,
"lf": 46,
"tab": 0,
"bom8": 0,
"bom16": 0,
"nul": 0,
"nonAscii": 0,
"bytesRead": 1256
}
]

- Fail when certain characters are detected, with `-f`
  - OS exit code on a `-f` failure is always `100`
  - `-f` is a comma-delimited list containing: `crlf`, `lf`, `tab`, `nul`, `bom8`, `bom16`, `nonascii`

$ chars -f lf,tab /etc/group ; echo $?
+------------+------+----+-----+-----+------+-------+-----------+-----------+
| FILENAME | CRLF | LF | TAB | NUL | BOM8 | BOM16 | NON-ASCII | BYTESREAD |
+------------+------+----+-----+-----+------+-------+-----------+-----------+
| /etc/group | 0 | 58 | 0 | 0 | 0 | 0 | 0 | 795 |
+------------+------+----+-----+-----+------+-------+-----------+-----------+
100

- Fail when certain characters are detected, with `-f`
- Only output failed file names, with `-F`

$ chars -f lf,tab -F /etc/gr* ; echo $?
/etc/group
/etc/group.bak
100

- Output to JSON, with `-j`
- Use `-e` to exclude any filenames starting with `go`, such as `go.mod` and `go.sum`
- Use `jq` to output to `CSV` containing two columns: `filename`, `tab`
  - Only include files that contain `tab` characters

$ chars -e '^go' -j * | jq -r '.[] | select(.tab > 0) | [.filename,.tab] | @csv'
"case.go",80
"chars.go",475

- Output totals, with `-t`
- Output commas in numeric values, with `-c`
- Exclude files containing `.g*`, with `-e`

PS C:\chars> .\chars.exe -t -c -e "\.g.*" *
+-----------------+------+-----+-----+-----+------+-------+-----------+-----------+
| FILENAME | CRLF | LF | TAB | NUL | BOM8 | BOM16 | NON-ASCII | BYTESREAD |
+-----------------+------+-----+-----+-----+------+-------+-----------+-----------+
| LICENSE | 0 | 21 | 0 | 0 | 0 | 0 | 0 | 1,068 |
| README.md | 0 | 178 | 4 | 0 | 0 | 0 | 0 | 6,656 |
| STATUS.md | 0 | 50 | 0 | 0 | 0 | 0 | 0 | 3,055 |
| go.mod | 0 | 11 | 3 | 0 | 0 | 0 | 0 | 214 |
| go.sum | 0 | 9 | 0 | 0 | 0 | 0 | 0 | 795 |
| TOTALS: 5 files | 0 | 269 | 7 | 0 | 0 | 0 | 0 | 11,788 |
+-----------------+------+-----+-----+-----+------+-------+-----------+-----------+
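
The `-j` output shown earlier can also be consumed without `jq`. The following Go sketch decodes that JSON and writes `filename,tab` rows for files that contain tabs. The field names come from the JSON example in this README; the `Result` type, the `tabsCSV` function, and the sample data are hypothetical. Unlike `jq`'s `@csv`, Go's `encoding/csv` only quotes fields that need quoting.

```go
package main

import (
	"encoding/csv"
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// Result mirrors one object in the chars -j output.
type Result struct {
	Filename  string `json:"filename"`
	CRLF      int    `json:"crlf"`
	LF        int    `json:"lf"`
	Tab       int    `json:"tab"`
	BOM8      int    `json:"bom8"`
	BOM16     int    `json:"bom16"`
	Nul       int    `json:"nul"`
	NonASCII  int    `json:"nonAscii"`
	BytesRead int    `json:"bytesRead"`
}

// tabsCSV decodes chars -j output and returns CSV rows (filename, tab)
// for every file whose tab count is greater than zero.
func tabsCSV(jsonData []byte) (string, error) {
	var results []Result
	if err := json.Unmarshal(jsonData, &results); err != nil {
		return "", err
	}
	var sb strings.Builder
	w := csv.NewWriter(&sb)
	for _, r := range results {
		if r.Tab == 0 {
			continue
		}
		if err := w.Write([]string{r.Filename, strconv.Itoa(r.Tab)}); err != nil {
			return "", err
		}
	}
	w.Flush()
	return sb.String(), w.Error()
}

func main() {
	// Hypothetical sample in the shape of the -j output.
	sample := []byte(`[
	  {"filename": "chars.go", "lf": 246, "tab": 328, "bytesRead": 6477},
	  {"filename": "LICENSE", "lf": 21, "tab": 0, "bytesRead": 1068}
	]`)
	out, err := tabsCSV(sample)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Print(out) // prints chars.go,328
}
```
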
- YMMV when piping to `STDIN` under Windows
  - Under `cmd`, instead of `type input.txt | chars`, use `<` redirection when possible: `chars < input.txt`
  - Under a recent version of `powershell`, use `Get-Content -AsByteStream input.txt | chars` instead of just `Get-Content input.txt | chars`
  - `cmd` and `powershell` will skip `BOM` characters; these 2 fields will both report a value of `0`
  - `cmd` and `powershell` will skip `NUL` characters; this field will report a value of `0`
  - `cmd` will convert `LF` to `CRLF` for `UTF-16` encoded files
  - `powershell` will convert `LF` to `CRLF`
  - Piping from programs such as `curl` will return `LF` characters under `cmd`, but `CRLF` under `powershell`
    - Under powershell, consider using `curl --output`
- Case folding on Windows is somewhat implemented in case.go.
  - This program attempts case-insensitive filename matching since this is the expected behavior on Windows.
  - It is hard-coded to `English`.

- Newline - `CRLF` vs `LF`
- Tab key
- Null character
- Byte order mark - `BOM-8` vs `BOM-16`

- ellipsis - Go module to insert an ellipsis into the middle of a long string to shorten it
- tablewriter - ASCII table in golang
- /u/skeeto and /u/petreus provided code review and suggestions