Linux split command

Author: JIYIK  Last Updated: 2025/04/08

split, one of the commands commonly used in pipelines, splits a large file into several smaller files. This is useful when a file is too large to read or handle comfortably in one piece.

The general usage of split is:

$ split [OPTION] [INPUT [PREFIX]]

By default, split writes the input to the output files in chunks of 1000 lines. Output file names consist of the prefix x followed by the suffixes aa, ab, ... in sequence, and the files are created in the current directory.

$ cd /tmp
$ split /etc/termcap
$ ll -k
-rw-r--r-- 1 root root 45 Sep 30 19:25 xaa
-rw-r--r-- 1 root root 43 Sep 30 19:25 xab
-rw-r--r-- 1 root root 44 Sep 30 19:25 xac
-rw-r--r-- 1 root root 44 Sep 30 19:25 xad
-rw-r--r-- 1 root root 37 Sep 30 19:25 xae
-rw-r--r-- 1 root root 38 Sep 30 19:25 xaf
-rw-r--r-- 1 root root 42 Sep 30 19:25 xag
-rw-r--r-- 1 root root 39 Sep 30 19:25 xah
-rw-r--r-- 1 root root 45 Sep 30 19:25 xai
-rw-r--r-- 1 root root 40 Sep 30 19:25 xaj
-rw-r--r-- 1 root root 41 Sep 30 19:25 xak
-rw-r--r-- 1 root root 40 Sep 30 19:25 xal
-rw-r--r-- 1 root root 39 Sep 30 19:25 xam
-rw-r--r-- 1 root root 44 Sep 30 19:25 xan
-rw-r--r-- 1 root root 41 Sep 30 19:25 xao
-rw-r--r-- 1 root root 41 Sep 30 19:25 xap
-rw-r--r-- 1 root root 41 Sep 30 19:25 xaq
-rw-r--r-- 1 root root 48 Sep 30 19:25 xar
-rw-r--r-- 1 root root 41 Sep 30 19:25 xas
-rw-r--r-- 1 root root  3 Sep 30 19:25 xat

As expected, the output files are named with the prefix x followed by aa, ab, ac, .... Every file except the last one, xat, contains 1000 lines of /etc/termcap.

We can also specify our own prefix for the output files:
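
To try the default behavior without touching a system file, here is a minimal sketch. It assumes GNU coreutils (split, seq, wc); input.txt and the scratch directory are made-up names used only for illustration.

```shell
# Sketch: default split on a generated file (assumes GNU coreutils).
workdir=$(mktemp -d)          # scratch directory, hypothetical location
cd "$workdir" || exit 1
seq 1 2500 > input.txt        # sample input: 2500 numbered lines
split input.txt               # no options: 1000 lines per file, prefix x
wc -l x*                      # line count of every output file
```

With 2500 input lines, this should produce xaa and xab with 1000 lines each and xac with the remaining 500.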

$ split /etc/termcap termcap
-rw-r--r-- 1 root root 45 Sep 30 19:28 termcapaa
-rw-r--r-- 1 root root 43 Sep 30 19:28 termcapab
-rw-r--r-- 1 root root 44 Sep 30 19:28 termcapac
-rw-r--r-- 1 root root 44 Sep 30 19:28 termcapad
-rw-r--r-- 1 root root 37 Sep 30 19:28 termcapae
-rw-r--r-- 1 root root 38 Sep 30 19:28 termcapaf
-rw-r--r-- 1 root root 42 Sep 30 19:28 termcapag
-rw-r--r-- 1 root root 39 Sep 30 19:28 termcapah
-rw-r--r-- 1 root root 45 Sep 30 19:28 termcapai
……

As the listing shows, the prefix has changed to the termcap we specified. Specifying a prefix is as simple as that: just add it after the input file name.

So far we have set the prefix of the output files. Next, we change the length of the suffix, which defaults to 2 (aa, ab, ...).

-a LENGTH sets the suffix length to LENGTH characters. The default is 2.
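
The same idea on a small generated file, as a sketch assuming GNU coreutils; input.txt and the prefix part_ are hypothetical names chosen for this example.

```shell
# Sketch: custom output prefix (assumes GNU coreutils).
workdir=$(mktemp -d)          # scratch directory, hypothetical location
cd "$workdir" || exit 1
seq 1 2500 > input.txt        # sample input: 2500 numbered lines
split input.txt part_         # PREFIX is the second positional argument
ls part_*                     # lists part_aa, part_ab and part_ac
```

Ending the prefix with a separator such as _ or . keeps the prefix and the suffix visually distinct in the resulting file names.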

$ split -a 3 /etc/termcap termcap
$ ll -k
total 908
-rw-r--r-- 1 root root 45 Sep 30 19:30 termcapaaa
-rw-r--r-- 1 root root 43 Sep 30 19:30 termcapaab
-rw-r--r-- 1 root root 44 Sep 30 19:30 termcapaac
-rw-r--r-- 1 root root 44 Sep 30 19:30 termcapaad
-rw-r--r-- 1 root root 37 Sep 30 19:30 termcapaae
……

The suffix is now 3 characters long.

-l LINES (a lowercase l, not an uppercase L) sets the number of lines written to each output file. The default is 1000 lines.
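
A small sketch of the -a option on generated data, assuming GNU coreutils; input.txt and part_ are hypothetical names.

```shell
# Sketch: three-character suffixes with -a 3 (assumes GNU coreutils).
workdir=$(mktemp -d)          # scratch directory, hypothetical location
cd "$workdir" || exit 1
seq 1 2500 > input.txt        # sample input: 2500 numbered lines
split -a 3 input.txt part_    # suffixes become aaa, aab, aac, ...
ls part_*                     # lists part_aaa, part_aab and part_aac
```

A fixed alphabetic suffix of length n allows at most 26^n output files, for example 17576 with -a 3, so a longer suffix is worth setting when the input will be cut into many pieces.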

$ split -l 2000 /etc/termcap termcap
$ ll -k
-rw-r--r-- 1 root root 88 Sep 30 19:36 termcapaa
-rw-r--r-- 1 root root 88 Sep 30 19:36 termcapab
-rw-r--r-- 1 root root 74 Sep 30 19:36 termcapac
-rw-r--r-- 1 root root 81 Sep 30 19:36 termcapad
-rw-r--r-- 1 root root 85 Sep 30 19:36 termcapae
……

Each file is roughly twice as large as before, because each file now holds 2000 lines instead of the default 1000.

So far we have split the input by number of lines. Now we will split it by size.

-b SIZE sets the size of each output file in bytes. SIZE can take a unit suffix such as b, k, or m; in GNU split, b means 512-byte blocks, k means 1024 bytes, and m means 1024 × 1024 bytes.
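
To check the line counts directly rather than infer them from file sizes, the following sketch (assuming GNU coreutils, with the hypothetical names input.txt and part_) counts the lines in each piece.

```shell
# Sketch: 2000 lines per output file with -l (assumes GNU coreutils).
workdir=$(mktemp -d)            # scratch directory, hypothetical location
cd "$workdir" || exit 1
seq 1 2500 > input.txt          # sample input: 2500 numbered lines
split -l 2000 input.txt part_   # at most 2000 lines per output file
wc -l part_*                    # part_aa holds 2000 lines, part_ab the other 500
```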

$ ll -k /etc/termcap
-rw-r--r-- 1 root root 789 Jan  7  2007 /etc/termcap
# /etc/termcap is 789k in size; here we split it into 200k pieces
$ split -b 200k /etc/termcap termcap
$ ll -k
-rw-r--r-- 1 root root 200 Sep 30 19:45 termcapaa
-rw-r--r-- 1 root root 200 Sep 30 19:45 termcapab
-rw-r--r-- 1 root root 200 Sep 30 19:45 termcapac
-rw-r--r-- 1 root root 189 Sep 30 19:45 termcapad

The result is as expected: the sizes of the four files add up to 789k, the size of /etc/termcap.
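
The size check can also be done byte for byte. The sketch below assumes GNU coreutils and uses a generated 512000-byte file; input.bin and part_ are hypothetical names. Concatenating the pieces in name order with cat should reproduce the original exactly.

```shell
# Sketch: split by size, then verify the pieces (assumes GNU coreutils).
workdir=$(mktemp -d)                       # scratch directory, hypothetical location
cd "$workdir" || exit 1
head -c 512000 /dev/urandom > input.bin    # sample input: 512000 random bytes
split -b 200k input.bin part_              # 200k = 204800 bytes per piece
wc -c part_*                               # two pieces of 204800 bytes, one of 102400
# The shell expands part_* in sorted order, which matches the split order.
cat part_* | cmp - input.bin && echo "pieces match the original"
```

cmp prints nothing and returns success when the two inputs are identical, so the final message appears only if the reassembled data matches.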

-C SIZE writes as many complete lines of the input as will fit within SIZE bytes into each output file. If adding a line would push an output file past SIZE, that line is not written to the current file; it goes into the next output file instead. If a single line is longer than SIZE, it is broken into pieces of at most SIZE bytes, each written to its own output file. For example, a 20-character line with -C 6 is split into four pieces across four files; although the fourth file holds only 2 characters, the next line starts in a fifth file. SIZE accepts the same units as the -b option.

$ cat /tmp/split
onmpw
wwwcom
jiyinet
blogorg
$ split -C 14 /tmp/split split
$ ls
splitaa  splitab  splitac    # three files
$ cat splitaa
onmpw
wwwcom
$ cat splitab
jiyinet
$ cat splitac
blogorg

The result should make the behavior of -C described above clear. Next, we specify -C 3. The newline at the end of each line also counts as one character, so the four lines shown are 6, 7, 8, and 8 bytes long and need 2 + 3 + 3 + 3 = 11 output files. Let's verify it.

$ split -C 3 /tmp/split split
$ ll
-rw-r--r-- 1 root root  3 Sep 30 20:26 splitaa
-rw-r--r-- 1 root root  1 Sep 30 20:26 splitab
-rw-r--r-- 1 root root  3 Sep 30 20:26 splitac
-rw-r--r-- 1 root root  3 Sep 30 20:26 splitad
-rw-r--r-- 1 root root  1 Sep 30 20:26 splitae
-rw-r--r-- 1 root root  3 Sep 30 20:26 splitaf
-rw-r--r-- 1 root root  3 Sep 30 20:26 splitag
-rw-r--r-- 1 root root  2 Sep 30 20:26 splitah
-rw-r--r-- 1 root root  3 Sep 30 20:26 splitai
-rw-r--r-- 1 root root  3 Sep 30 20:26 splitaj
-rw-r--r-- 1 root root  2 Sep 30 20:26 splitak

Counting them, there are indeed 11 files. Interested readers can view the contents of each file to deepen their understanding of split -C.

-d uses numeric suffixes instead of the default lowercase letters.
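
The -C 14 example above can be reproduced in a scratch directory. This is a sketch assuming GNU split; sample.txt and the prefix piece_ are hypothetical names, and the file content is the four lines shown earlier.

```shell
# Sketch: keep whole lines together within a byte limit (assumes GNU split).
workdir=$(mktemp -d)              # scratch directory, hypothetical location
cd "$workdir" || exit 1
printf 'onmpw\nwwwcom\njiyinet\nblogorg\n' > sample.txt
split -C 14 sample.txt piece_     # at most 14 bytes of whole lines per file
for f in piece_*; do
    echo "== $f"                  # print each piece under its file name
    cat "$f"
done
```

The first two lines total 13 bytes including newlines, so they share piece_aa; adding the 8-byte third line would exceed 14 bytes, so it starts piece_ab, and the fourth line lands in piece_ac for the same reason.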

$ split -db 300k /tmp/termcap termcap
-rw-r--r-- 1 root root 307200 Sep 30 20:28 termcap00
-rw-r--r-- 1 root root 307200 Sep 30 20:28 termcap01
-rw-r--r-- 1 root root 192703 Sep 30 20:28 termcap02

The suffixes are now numeric (00, 01, 02).

As we can see, splitting files in Linux is very simple: a single split command does the job. That's all about split. For more detailed instructions, run info split.
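
A sketch of numeric suffixes on generated data, assuming GNU split (the -d option is not available in every split implementation); input.bin and part_ are hypothetical names.

```shell
# Sketch: numeric suffixes with -d (assumes GNU split).
workdir=$(mktemp -d)                    # scratch directory, hypothetical location
cd "$workdir" || exit 1
head -c 512000 /dev/zero > input.bin    # sample input: 512000 zero bytes
split -d -b 200k input.bin part_        # suffixes become 00, 01, 02, ...
ls part_*                               # lists part_00, part_01 and part_02
```

Numeric suffixes sort the same way as alphabetic ones, so the pieces can still be put back together in order with cat part_*.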

I hope this article is helpful to you.
