Intro:Linux

Linux is a Unix-like operating system. Like Windows, it controls communication between hardware and software and allows the user to manipulate the system. Many of the flavors of Linux offer a Graphical User Interface, or GUI. For a programmer, it is very important to also be able to work with the system via a Command Line Interface. Here basic Linux commands will be addressed.



Commands 1

At the Linux command line, you will be greeted with a prompt. The prompt often displays some information, such as the current working directory or the current shell. These possibilities will be addressed later, but for now, let us consider the simple Cygwin prompt "$_". The dollar sign is followed by a blinking cursor (shown here as an underscore) marking the place where the next character you type will be added to the command line.

Good starting commands to know are ls, cp, and mv. The command ls lists the contents of the current working directory, the place on disk currently being accessed by the operating system. The command cp copies a file, and mv moves a file from one position on disk to another. Note that each command consists of the first and third letters of its corresponding verb (list, copy, move). This provides a good mnemonic for keeping them straight. This pattern quickly falls by the wayside with more advanced work, however.

Let's see these commands in action. Suppose you have the files fileA.txt, fileB.txt, myFileA.dat, and myFileB.dat in the directory /home/Rob/data. By the way, to view the current working directory use the command pwd.

Typing ls results in the output:

   fileA.txt
   fileB.txt
   myFileA.dat
   myFileB.dat

ls can also take arguments. For example, ls *.txt will list only files ending with the extension .txt. Here the '*' is called a wildcard. It is like a variable that stands for any sequence of allowable characters, so adsfadsf.txt, dsafa23.w323.txt, and sdaf_wik.txt are all matches for *.txt.
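The listing and the wildcard can be tried safely in a scratch directory. The following is a minimal sketch, assuming a shell with mktemp available; the variable names workdir, all_files, and txt_files are made up for illustration.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create the four (empty) example files named in the text.
touch fileA.txt fileB.txt myFileA.dat myFileB.dat

# Capture a listing of everything, then only the names ending in .txt.
all_files=$(ls)
txt_files=$(ls *.txt)

echo "$all_files"
echo "---"
echo "$txt_files"
```

The shell expands *.txt into the matching file names before ls ever runs, which is why the second listing contains only the two .txt files.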

Commands usually have 'switches', also called options, which slightly alter their behavior. Whereas ls will show only the names in the current working directory, ls -l will also show each file's size, date of last modification, access permissions, and other data. Besides options, ls can also be passed arguments, information about what the command should be performed on. So while you are in /home/Rob/data, ls /home/Rob will list the contents of the directory /home/Rob while still keeping the current working directory the same.

This illustrates the basic use of a command at the command line: command <options> <arguments> .
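As a rough illustration of the command, option, argument pattern, the sketch below runs ls with the -l option and a directory argument from inside a different directory. The scratch directory and the variable names (workdir, long_listing, still_here) are assumptions made for the example.

```shell
# Build a small scratch tree: two files and a data subdirectory.
workdir=$(mktemp -d)
mkdir "$workdir/data"
touch "$workdir/fileA.txt" "$workdir/fileB.txt"
cd "$workdir/data" || exit 1

# command: ls    option: -l    argument: the directory to list
long_listing=$(ls -l "$workdir")

# Passing a directory as an argument does not move us anywhere.
still_here=$(pwd)

echo "$long_listing"
echo "$still_here"
```

The long listing shows the parent directory's contents, while pwd still reports the data subdirectory.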

Possible options for a command can be found by consulting the associated man page. man ls will open up Linux's built-in user's manual for the command ls. These man pages are required reading for acquiring a deep understanding of Linux.

The command cp takes two arguments, file1 and file2. It creates file2 as a file having the same contents as file1.

cp fileA.txt fileG.txt will make a file, fileG.txt, which is a duplicate of fileA.txt. There are several options available which are listed in the man page.

mv fileA.txt fileG.txt, on the other hand, will move fileA.txt to fileG.txt without making a duplicate. This has the effect of renaming fileA.txt. Had a path been specified with the new filename, the file would be moved to a new location on disk.
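The difference between copying, renaming, and moving can be seen in a scratch directory. This is only a sketch: the names fileH.txt and archive, and the shell variables, are hypothetical and chosen for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo "some text" > fileA.txt

# cp: fileG.txt becomes a second, independent copy of fileA.txt.
cp fileA.txt fileG.txt
after_cp=$(ls)

# mv with a plain new name: fileA.txt is renamed, no duplicate is made.
mv fileA.txt fileH.txt
after_rename=$(ls)

# mv with a path in the destination: the file lands in another directory.
mkdir archive
mv fileH.txt archive/fileH.txt
after_move=$(ls archive)

echo "$after_cp"
echo "$after_rename"
echo "$after_move"
```

After the copy both names exist; after each mv the old name is gone.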

Commands 2

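Now more complex commands will be introduced. cd is one of the most useful commands. It changes the current working directory, and it takes the destination directory as an argument. So if you are at the directory /home/Rob/data, cd /home/Rob will take you to /home/Rob, and cd / will take you to the root directory, the top of the file system. (Note that /root is a different place: it is the home directory of the root user.)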

echo will pass its arguments to "standard out", which is the terminal screen by default. So echo "hello world" will print hello world to the screen.

Linux has special "environment variables" which help keep track of current system configurations. Some common variables are PWD, OLDPWD, and PATH. To view the current value stored in an environment variable, use echo with the variable name preceded by a dollar sign, for example echo $PWD. The dollar sign specifies that the value of the variable is desired; without it, echo will just use its default behavior and print the letters PWD to the screen. PWD stores the current working directory, so echo $PWD, which prints the value of PWD to the screen, has the same effect as the command pwd. OLDPWD stores the previous working directory. This is useful when you have to switch back and forth between 'distant' directories for testing purposes.
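A short sketch of how cd, PWD, and OLDPWD interact is given here, assuming a scratch directory made with mktemp. The directory names data and other and the variable names are illustrative only. It also uses cd -, a standard shorthand for returning to the directory stored in OLDPWD.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/data" "$workdir/other"

cd "$workdir/data" || exit 1
from_variable=$PWD      # value of the environment variable
from_command=$(pwd)     # output of the pwd command

cd "$workdir/other" || exit 1
previous=$OLDPWD        # the directory we just left

# "cd -" returns to the previous directory; it also prints the
# directory name, which is discarded here.
cd - >/dev/null || exit 1
back_again=$PWD

echo "$from_variable"
echo "$previous"
echo "$back_again"
```

All three lines printed at the end name the same directory, the data directory.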

PATH is one of the most important environment variables. It holds a list of directories separated by colons. When a command is used at the command line, the shell searches the directories listed in PATH to find that command. If it does not find it, it will give a 'command not found' error. If you use echo $PATH you will usually see mention of the directory /usr/bin. It contains many of the basic programs of Linux; most commands are actually just references to these programs.
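The search through PATH can be observed directly. The sketch below assumes a POSIX-style shell; no_such_command_xyz is a made-up name that is presumed not to exist on the system, and the variable names are illustrative.

```shell
# Show each directory on the search path on its own line.
path_dirs=$(echo "$PATH" | tr ':' '\n')
echo "$path_dirs"

# Ask the shell where it would find the program behind ls.
ls_location=$(command -v ls)
echo "$ls_location"

# A name that is not found in any PATH directory fails; the shell
# reports exit status 127 for "command not found".
status=0
no_such_command_xyz 2>/dev/null || status=$?
echo "exit status: $status"
```

The location printed for ls depends on the system; it is commonly somewhere under /usr/bin or /bin.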

So far, we have covered moving around the file system and manipulating files at the level of the directories in which they are stored. There are also ways of getting at the insides of a file without changing it. cat will print a file to the screen. head will print the first several lines of a file to the screen. tail will print the last several lines of a file to the screen. Command-line options are available for each of these, which allow you to tailor them to your needs. One of the most important commands is grep.

grep allows you to search for matching strings within a file. (A string is a sequence of characters.) For grep, if the string includes spaces, the whole string should be enclosed in quotes. It is very important that a file be specified for grep to look through. Just typing grep and a string will leave the terminal appearing stuck, because grep is waiting for input typed at the keyboard. This is going to happen at some time or other. When it does, hit ctrl-c. This ends the execution of the current command. So suppose I had a file called animals.txt and it read like this:

    Aardvark
    Bear
    Cat
    Deer
    Elephant
    Fox
    Giraffe
    Horse

'grep vark animals.txt' will return the match "Aardvark".
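The file-viewing commands and grep can be tried together on the animals.txt example. This is a sketch in a scratch directory; the extra line "polar bear" is not part of the original file and is added only to show a quoted search string, and the variable names are illustrative.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate animals.txt from the text, one animal per line.
printf '%s\n' Aardvark Bear Cat Deer Elephant Fox Giraffe Horse > animals.txt

first_two=$(head -n 2 animals.txt)   # first 2 lines
last_two=$(tail -n 2 animals.txt)    # last 2 lines
match=$(grep vark animals.txt)       # every line containing "vark"

# A search string containing a space must be quoted, otherwise grep
# would treat "bear" as the name of a file to search.
printf '%s\n' "polar bear" >> animals.txt
quoted_match=$(grep "polar bear" animals.txt)

echo "$first_two"
echo "$last_two"
echo "$match"
echo "$quoted_match"
```

Here -n sets how many lines head and tail print. grep prints the whole of each matching line, not just the matched part.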

Perl

Perl is an interpreted language: Perl code must be interpreted each time a program is run. So suppose you have a program myprog.pl. To run it, type:

perl myprog.pl

on the command line. By default, the system looks for the program in the current directory. The exact location can be specified to run a program in another directory.

Let's consider a simple perl program.

#!/usr/bin/perl

print "Hello, world!\n";

This will print Hello, world! to the screen. The print command passes whatever is in quotes, the string, to stdout (standard output), which in this case is the screen.

Note the first line, starting with #!. This is called the "shebang" line. It tells the computer where to find the perl interpreter, here /usr/bin/perl. The location may be different on your computer. To find it, use the which command.

which perl

and it will return the location.
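Putting these steps together from the command line might look like the sketch below. It assumes perl is installed (the whole block is skipped otherwise) and uses command -v, which reports a program's location much as which does. The scratch directory and variable names are made up for the example.

```shell
# Only run the demonstration if a perl interpreter is installed.
if command -v perl >/dev/null 2>&1; then
    workdir=$(mktemp -d)
    cd "$workdir" || exit 1

    # Write the two-line program from the text into myprog.pl.
    # Quoting 'EOF' keeps the shell from altering $ or \ inside.
    cat > myprog.pl <<'EOF'
#!/usr/bin/perl
print "Hello, world!\n";
EOF

    perl_location=$(command -v perl)   # where the interpreter lives
    program_output=$(perl myprog.pl)   # run the program, keep its output

    echo "$perl_location"
    echo "$program_output"
fi
```

If the location reported differs from /usr/bin/perl, the first line of the program would need to be changed to match before the file could be run directly as a command.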

The semicolon is also important. It tells the interpreter that the statement is ending.

Here is another version of the program that does the same thing.

#!/usr/bin/perl

$words="Hello, world!\n";
print $words;

Here the variable $words, a container for data, is set equal to the string "Hello, world!\n" by what's called an assignment. print takes the contents of $words and outputs it to stdout.

The $ is very important. Perl has 3 major data-types. These are scalars, arrays/lists, and hashes. Each has a special symbol which signifies to the interpreter what kind of data is stored in the variable. So, $ precedes the variable name for a scalar, @ for arrays, and % for hashes. The latter two will be discussed shortly. Now, more on scalars.

Numbers can also be stored in variables.

Suppose $a=5 and $b=3.

print $a will print 5 to screen. print $b will print 3 to screen.

print ($a + $b) will print the sum of $a and $b, or 8. So standard algebraic expressions can be represented by perl variables.

The '-' is used for subtraction, '+' for addition, '*' for multiplication, and '/' for division. Several other built-in functions are also available. The function sin($x) will return the sine of $x, with $x in radians.

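Strings can also be joined together, or concatenated, by using the '.' operator. Suppose $a="white" and $b="christmas". Then this:

print $a.$b;

will print whitechristmas to screen. The '.' operator joins the strings exactly as they are, so to get white christmas a space must be included, as in print $a." ".$b;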

Arrays

Arrays are used to combine a collection of data in the same variable name. Further, that data can be accessed sequentially which has several benefits.

Let's see a sample program:

#!/usr/bin/perl

@vars=("raisin", "orange", "pineapple");
print $vars[0];

This program will print the first 'element' of the array @vars. Arrays are 'zero indexed'. The first element has index 0, the second element has index 1, and so on.

Notice when @vars is assigned data it starts with an @, but when the first element is used in print, it starts with $. This is because an array is just a row of scalars. Specifying a particular element is specifying a scalar, so it should be preceded by $.

Just as elements can be accessed, they can also be assigned values.

$vars[0]="apple";

will change the first element of @vars from raisin to apple.

Arrays make more efficient use of memory and allow for easier programming.

Here is a program for adding 5 numbers using all scalars:

#!/usr/bin/perl

$a=1; $b=2; $c=3; $d=4; $e=5;
$sum=$a+$b+$c+$d+$e;
print "total: $sum\n";

Here is a way to do it with an array:

#!/usr/bin/perl

@arr=(1,2,3,4,5);
$sum=0;
foreach $num (@arr) {
    $sum=$sum+$num;
}
print "total: $sum\n";

The foreach part will be discussed shortly, but for now a few points are noteworthy. The code is more compact. The assignment of values is done in a single line. This can be done compactly using scalars with short variable names, but most of the characters entered have nothing to do with the actual data. Look at the assignment line in the array program: "@arr=(1,2,...)". The bulk of the assignment is precisely the data that needs to be stored.

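Arrays also keep related data organized in memory. Separate scalars have no particular relationship to one another, whereas the elements of an array are kept in a row, so there is an intrinsic organization to them. This can help running speed, although in Perl the larger benefit is usually the clearer code.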

The foreach command above is a control structure. Control structures change the flow of a program. Some code you only want to be run when certain conditions are met; sometimes you want other code to be run when those conditions are not met. Control structures help the program decide what to do in different situations. One example of a control structure is a loop. It specifies that a chunk of code is to be run multiple times. How many times the code is run depends on "conditionals", true-false statements. foreach is an example of a loop; it runs its code once for each element of an array.

The foreach command assigns to the variable immediately following it, in this case $num, a value from the array specified in parentheses, @arr. At each step of the loop, $num takes on the next value stored in @arr, in order. So here's what the second, array-based program does.

First values are assigned to @arr. Then $sum is initialized to 0. Finally the loop is begun.

First $num is equal to the first value stored in @arr, 1. Next the current value of $num is added to the current value of $sum. Then the next step is begun. $num is set equal to the value of the second element of @arr, so $num=2. Then the now current value of $num, 2, is added to $sum. And so on until all the values of the array have been used, at which point $sum holds 15 and the program prints total: 15.