Splitting a string into characters in Bash shell

To split a string into its individual characters in bash, you can pipe it through an external tool such as fold or grep, index it with substring expansion, or use bash's own =~ operator or printf builtin. The printf loop shown below works at least as far back as bash-2.05b. Bash-3.1 offers slightly more flexibility through printf -v and lets you avoid falling off the end of the string, which comes in handy if you are doing something other than printing the characters, such as creating an array. When the characters may include whitespace, quote the array expansions ("${arr[@]}") so that each element is preserved intact.


Solution 1:

Try

echo "abcdefg" | fold -w1

Edit: a more refined approach, suggested in the comments:

echo "abcdefg" | grep -o .
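If you want the characters in a bash array rather than printed one per line, the output of either command can be read back in. A minimal sketch, assuming bash (for process substitution) and a grep that supports -o; the names str and chars are illustrative and do not come from the answer above:

```shell
#!/usr/bin/env bash
# Hypothetical names: str (input) and chars (resulting array).
str="abcdefg"
chars=()

# grep -o . prints each matched character on its own line;
# IFS= and -r keep spaces and backslashes intact while reading.
while IFS= read -r ch; do
    chars+=("$ch")
done < <(printf '%s\n' "$str" | grep -o .)

printf 'count: %s\n' "${#chars[@]}"
printf 'third: %s\n' "${chars[2]}"
```

Because grep works line by line, this sketch drops any newline characters in the input.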


Solution 2:


You don't need to convert the string to an array, since you can access each letter individually with substring expansion.

$ foo="bar"
$ echo ${foo:0:1}
b
$ echo ${foo:1:1}
a
$ echo ${foo:2:1}
r

Alternatively, you could use something like this:

$ bar=($(echo $foo|sed  's/\(.\)/\1 /g'))
$ echo ${bar[1]}
a

If sed is not available, the first technique can be combined with a while loop that uses the length of the original string (${#foo}) to build the array.

Note that the code below will not work correctly if the string contains whitespace. Vaughn Cato’s answer is likely to be more reliable with special characters.

thing=($(i=0; while [ $i -lt ${#foo} ] ; do echo ${foo:$i:1} ; i=$((i+1)) ; done))
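The whitespace problem comes from the unquoted command substitution, which word-splits the output. A minimal sketch of a variant that appends each character directly to the array instead, assuming bash-3.1 or later for the += array append (the sample value of foo is illustrative):

```shell
#!/usr/bin/env bash
# Illustrative input containing a space and a glob character.
foo="a b*c"
thing=()

# Append one quoted character per iteration; no word splitting
# or filename expansion is applied to the quoted expansion.
for ((i = 0; i < ${#foo}; i++)); do
    thing+=("${foo:i:1}")
done

printf '[%s]\n' "${thing[@]}"
```

Quoting "${foo:i:1}" is what keeps the space and the * as literal single-character elements.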


Solution 3:


Instead of using a for/while loop to iterate over 0 .. ${#string}-1, there are two other ways to do this in pure bash: =~ and printf. A third option uses eval and a {..} sequence expression, but it is not the clearest choice.
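For completeness, the eval variant might look roughly like the following. This is only a guess at what is meant, not code from the answer; it assumes bash-3.1 or later for +=, and the guard for the empty string is an addition. eval is needed because brace expansion runs before arithmetic expansion, so the end of the sequence has to be substituted into the text first:

```shell
#!/usr/bin/env bash
string="wonkabars"
arr=()

# Guard: for an empty string, {0..-1} would still expand to two words.
if [ -n "$string" ]; then
    # The outer double quotes expand the arithmetic to "8" here, and
    # eval then brace-expands {0..8} and runs the loop.
    eval "for i in {0..$(( ${#string} - 1 ))}; do arr+=(\"\${string:i:1}\"); done"
fi

printf '%s\n' "${arr[@]}"
```

The amount of escaping needed is a fair illustration of why this option is harder to read than the other two.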

With NLS enabled in bash and a correctly configured environment, both =~ and printf work as hoped with non-ASCII characters. This removes a potential source of failure with older system tools such as sed, if that is a concern. Both methods work from bash-3.0 (released in 2004) onward.

Using =~ and regular expressions, you can convert a string to an array in a single expression:

string="wonkabars"
[[ "$string" =~ ${string//?/(.)} ]]       # splits into array
printf "%s\n" "${BASH_REMATCH[@]:1}"      # loop free: reuse fmtstr
declare -a arr=( "${BASH_REMATCH[@]:1}" ) # copy array for later

This works by expanding string with every character replaced by (.), then matching the resulting regular expression with grouping so that each character is captured into BASH_REMATCH[]. Index 0 holds the entire matched string; because BASH_REMATCH is read-only, that element cannot be removed, hence the :1 offset when expanding the array to skip index 0 where needed. Testing suggests that for non-trivial strings (longer than 64 characters) this method is substantially faster than one using bash string and array operations.

The approach above also works with strings containing newlines: =~ uses POSIX extended regular expressions, in which . matches any character except NUL by default (the pattern is compiled without REG_NEWLINE). POSIX text-processing utilities are allowed to behave differently in this respect by default, and usually do.
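To see the newline behaviour for yourself, a short check along these lines may help. It is a sketch that assumes bash-3.0 or later, and the sample string is illustrative:

```shell
#!/usr/bin/env bash
# Illustrative input: contains a space and an embedded newline.
string=$'a b\nc'

# Every character becomes a (.) group; . also matches the newline.
[[ "$string" =~ ${string//?/(.)} ]]
arr=( "${BASH_REMATCH[@]:1}" )   # skip index 0 (the whole match)

# %q prints each element in a shell-quoted form so the
# whitespace characters are visible.
printf '%q\n' "${arr[@]}"
```

Characters in the string that are special in regular expressions cause no trouble here, since the pattern is built only from (.) groups and never contains the original characters.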

The second option uses printf:

string="wonkabars"
ii=0
while printf "%s%n" "${string:ii++:1}" xx; do 
  ((xx)) && printf "\n" || break
done 

The loop increments the index ii to print one character at a time, and breaks out when there are no characters left. It would be simpler if bash's printf returned the number of characters printed (as in C) rather than an error status; instead, the number of characters output is captured using %n and stored in xx. This works at least as far back as bash-2.05b.

With bash-3.1 and printf -v var you have slightly more flexibility, and you can avoid falling off the end of the string when doing something other than printing the characters, for example creating an array:

declare -a arr
ii=0
while printf -v cc "%s%n" "${string:(ii++):1}" xx; do 
    ((xx)) && arr+=("$cc") || break
done
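The loop above assumes string is already set. A self-contained run, with an illustrative input that includes a space, might look like this (a sketch, assuming bash-3.1 or later):

```shell
#!/usr/bin/env bash
string="wonka bars"   # illustrative input
arr=()
ii=0

# printf -v stores the character in cc; %n stores the number of
# characters written so far in xx, which is 0 past the end.
while printf -v cc '%s%n' "${string:(ii++):1}" xx; do
    ((xx)) && arr+=("$cc") || break
done

printf 'elements: %s\n' "${#arr[@]}"
printf '[%s]\n' "${arr[@]}"
```

Since nothing is word-split along the way, the space ends up as its own array element.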

Frequently Asked Questions