
Linux system administration course

Spring 2019


Linux system administration © Jani Jaakkola 2019

System administrator

Wikipedia: A system administrator, or sysadmin, is a person who is responsible for the upkeep, configuration, and reliable operation of computer systems; especially multi-user computers, such as servers. The system administrator seeks to ensure that the uptime, performance, resources, and security of the computers they manage meet the needs of the users, without exceeding a set budget when doing so.


System administrator

Marcel Gagné:

system administrator

“n. Part magician, part juggler, part technical support analyst, and part bartender/psychoanalyst, the system administrator performs the impossible job of keeping all members of his or her company satisfied by making sure that everything works. This usually includes such things that are completely outside the system administrator’s control, such as telephones, photocopiers, fax machines, heating, air conditioning, and paper shortages in the supply cabinet.”


DevOps

● Wikipedia: DevOps (a clipped compound of "development" and "operations") is a software development methodology that combines software development (Dev) with information technology operations (Ops). The goal of DevOps is to shorten the systems development life cycle while delivering features, fixes, and updates frequently in close alignment with business objectives.
● Software cannot be run in a vacuum: it needs a secure environment with physical hardware, storage, libraries and services.
● The Linux administration course is about the tools and methods for maintaining that environment, whether it is a physical machine, a virtual machine or a container.


Solving a problem by looking at the source code:


World domination

● "I'd like to say that I knew this would happen, that it's all part of the plan for world domination." - Linus Torvalds
● Linux started as a hobby project, for hackers.
● 1994: Linux 1.0 released
● 1996: Apache becomes the world's number one web server
● 1998: Server vendors start to officially support Linux
● 2013: Steam is released for Linux
● 2016: Microsoft releases an official Linux subsystem for Windows 10.
● "If Microsoft ever does applications for Linux it means I've won." - Linus Torvalds


The course 2019

● Lecturer: Jani Jaakkola [email protected]
● Lectures: Tuesdays and Thursdays from 15.1.2019 to 2.5.2019.
● There will be two large exercise jobs and an exam
  ○ The exercise jobs and the exam are based on lectures and lecture notes
  ○ The lecture notes will be added to the wiki pages after lectures
● Most things learned will be demonstrated on multiple live Linux installations
  ○ These demonstrations probably won’t be recorded
  ○ There will be multiple demo effects when things suddenly won’t work...
● Course wiki page:
  ○ https://wiki.helsinki.fi/x/0gEiB
● We are going to use Telegram for instant messaging at: https://t.me/tkt_sysadmin


This is a technical course!

● We study the hardware and software required for a working Linux environment
  ○ Sometimes at a very deep level (I hope)
● The following are useful skills to have, but they are not the subject of this course:
  ○ Customer management and psychology
  ○ Negotiations with vendors
  ○ Communications
  ○ Project management
● This course is not about the internals of the Linux kernel.
  ○ We use it and configure it, but do not delve into kernel development.
● It is assumed that we all already know how to use Linux.


Useful links

● Red Hat Enterprise Linux 7 System Administrator's Guide:
  ○ https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/system_administrators_guide/index
● StackExchange Unix / Linux:
  ○ https://unix.stackexchange.com/
● Arch Linux Wiki:
  ○ https://wiki.archlinux.org/index.php/Table_of_contents
● Official Ubuntu documentation isn’t very helpful. It is aimed at ordinary users.


Why should you use Linux?

● It costs nothing.
● It is open source.
● It is stable and fast.
  ○ At least the kernel is...
● It is widely supported by vendors.
  ○ Yes, even by Microsoft.
● There exists a staggering amount of software and tools for development and system maintenance.
  ○ Standards are very nice. There are so many to choose from.
● Nowadays there is plenty of Linux software of which your humble lecturer knows nothing...


Part 1: a Linux system installation

● Linux distributions: Ubuntu and Red Hat.
● Linux system installation.
● Software package management: deb packages and rpm packages.
● Management of user accounts
● The GUI system
● The hardware: CPUs, memory, GPU, PCI, USB, …
● File systems and block devices.
● Processes and daemons.
● System security.


Part 2: the network

● Network interfaces and network drivers
● Firewalls: iptables (the successor of ipchains)
● Bonding, bridging, routing
● OpenVPN
● Joining a workstation to a centrally managed network:
  ○ List of historical stuff that you should avoid (NIS, copying files)
  ○ LDAP: Lightweight Directory Access Protocol
  ○ Kerberos
  ○ Active Directory (AD)


Part 3: the Linux server

● Remote management with Ansible and ssh.
● Managing X.509 certificates
● HTTP servers: Apache, nginx
● Web application development: Java, PHP, Python
● Databases: MySQL, PostgreSQL
● Printing
● File sharing: Samba and NFS
● Backups


Part 4: server management and virtualisation

● Virtualisation and hypervisors
  ○ Qemu and KVM
  ○ Libvirt
  ○ Paravirtualisation and virtual machine guest drivers and agents
● There is a separate course for containers!
● Partitioning: namespaces and cgroups
● The cloud
● Serverless computing?


Exercises

● There are no weekly exercises
● There are two big exercises
  ○ Both are likely to take hours of work
  ○ You need to return a study diary for both of the exercises
● The 1st exercise is about installing and configuring a Linux system
  ○ You need access to a physical machine for this. Using a VM is too easy.
  ○ Freshmen laptops are good for this.
● The 2nd exercise is about installing and configuring Linux servers
  ○ You will be provided virtual machines for the 2nd exercise
● Half of the course points come from exercises (30)
● You must return the exercises to pass the course
● Working in groups is allowed. However, mention it in the study diary.


Things you should know

● In this course almost everything is done using the command line
  ○ You have to be prepared to Read The Fine Manual
  ○ You need to know how to manipulate files: cp, rm, mv, chmod, chown, df, du, find
  ○ You need to know how to manipulate processes: ps, kill, top
  ○ How to manipulate text: cat, grep, sort
  ○ At least one text editor for terminal: Vi and Emacs are fine
● Shell scripting!
  ○ How to write loops: for i in a b c; do echo $i; done
  ○ Writing functions: f() { echo I am a function; }
  ○ Running subshells and variables: var=$(f)
  ○ Reading stuff: while IFS=: read luser x uid junk; do echo "$luser($uid)"; done < /etc/passwd
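The shell constructs listed on this slide can be combined into one small script. The following is only a sketch: the function names greet and list_users and the two sample account lines are made up for illustration.

```shell
# Loop over a fixed word list.
for i in a b c; do
    echo "item: $i"
done

# Define a function and capture its output with command substitution.
greet() { echo "I am a function"; }
var=$(greet)
echo "captured: $var"

# Read colon-separated records from standard input.
# "_" is used as a throwaway variable for fields we do not need.
list_users() {
    while IFS=: read -r luser _ uid _; do
        echo "$luser($uid)"
    done
}

# Hypothetical sample data in /etc/passwd format.
printf 'alice:x:1000:1000::/home/alice:/bin/bash\nbob:x:1001:1001::/home/bob:/bin/sh\n' | list_users
```

Quoting "$luser($uid)" matters here: without the quotes the shell treats the parenthesis as syntax and refuses to parse the line.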


Open Source and licenses

GNU, GPL, BSD, Apache, ...


Open Source

● Open Source means that the source code of the software is freely available for comments and modification.
● Open Source is a new name (1998) for a very old concept.
  ○ Technology and information sharing was invented first, writing was invented later
● In research and university environments software source code has always been shared, just like scientific literature.
● Closed source proprietary software came later, when it became evident that money could be made by selling licenses to use software.
● When the Unix OS was first invented in the 1970s it was shared as source code.
● Free software means that you are licensed to use the software without payment.
● You still need a license to use any software!


Open Source Software Licenses

● Software copyright is protected by international treaties and law.
● A copyright is automatically granted to the writer of any software.
● There is no freeware. A license is needed to use any software.
● Linux distributions are software too. You need a license to use them.
● There are multiple commonly used open source licenses, with different granted rights and different objectives.
● Different Linux distributions have different policies regarding licenses.


BSD Licenses

● The BSD license was invented for the Berkeley Software Distribution
  ○ The Unix OS written at the University of California, Berkeley
● Allows redistribution of the software in source and binary form
● It allows proprietary software to be derived from BSD source code
● The permissive licensing of BSD source code allowed it to be used in proprietary projects
  ○ SunOS and Solaris
  ○ MacOS Darwin kernel is based on BSD Unix source code.
  ○ Even Windows.
● There used to be lots of BSD derived user space tools in Linux distributions.
● Apache and MIT licenses are slight variations on the same theme.


MIT License

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.


GNU General Public License (GPL)

● The problem: MIT and BSD licenses allow closed source derivatives.
  ○ You can make money with the closed source derivatives.
  ○ There exists an incentive to not return derived code back to the community.
● The GPL requires that if you redistribute any derived GPL code, you must redistribute the derived source code too, under the same GPL license.
● GPL is viral. If you use GPL code in your code, your code must be GPL too.
● GPL has survived the test of time!
● The Linux kernel and most open source software in Linux distributions use the GPL license.
● Linux is still only one project, even when multiple competing commercial entities contribute to it.
● The Lesser GPL (LGPL, originally the Library GPL) allows using LGPL licensed libraries in non GPL projects.


Linux: the operating system

Distributions and components


Linux: the operating system

● Linux kernel: just the core.
  ○ The lowest level of the system, implementing hardware drivers and hardware resource management
● Low level user space libraries: glibc, libstdc++
● System daemons:
  ○ Systemd, udev, sssd, sshd, display manager
● Text based consoles, straight from the 1970s
● Graphics: X Window System from the 1980s, Wayland maybe in the future
● GUI: Gnome, KDE, Cinnamon, Mate
● Applications


The Linux kernel

● Monolithic core of the OS:
  ○ Memory manager
  ○ Process and thread scheduler
  ○ Hardware drivers
    ■ Including virtual hardware: block devices, virtual network bridges, virtual machines
  ○ File systems
● The kernel consists of a core and modules
  ○ The core is loaded and started at boot time and it does not change after boot
  ○ Modules are dynamically loaded at run time to implement hardware drivers and functionality
● A Linux distribution provides a tested and hopefully stable kernel
  ○ It is still possible to build your own


Linux distributions

● The open source Linux OS consists of thousands of separately developed pieces of software
● A Linux distribution is the collection of this software, where all the OS pieces are packaged and configured to work together in an easily installable OS distribution
● A distribution provides
  ○ OS installer
  ○ A GUI
  ○ A collection of default software
  ○ System for managing software updates
  ○ A database of installed software
  ○ A software repository from where additional software may be installed


Red Hat Enterprise Linux (RHEL)

● One of the oldest distributions. Available since 1994.
● Currently the “industry standard” distribution.
● The current stable major version, Red Hat 7.0, was released in 2014
  ○ 4 years is a very long time in the open source world
  ○ The latest release, Red Hat 7.6, contains mainly bug fixes
● All parts of the installation are strictly open source.
  ○ However, you need a license for download and support.
● We are going to use RHEL 8.0 beta during the course
● Red Hat the company
  ○ Was recently sold to IBM for $34 billion (the deal was announced in October 2018)
  ○ The company makes its income from support and training contracts, not from software licensing
● CentOS is the freely installable version of Red Hat


Fedora Linux

● Non commercial development project, which is sponsored by Red Hat
● No official (commercial) support for users
● Fast development cycle
  ○ Two releases every year
● Red Hat 7 was based on Fedora 19
● Red Hat 8 will be based on Fedora 28
● Uses the Red Hat package management system (RPM).


Debian GNU/Linux

● The GNU project was created by Richard Stallman to implement a GPL licensed Unix from scratch
  ○ The GCC compiler and the Linux C library (glibc) came from the GNU project.
● The Debian project is a completely community based Linux distribution, where software packaging and maintenance is done by volunteers.
● Debian is the closest thing to the OSS ideal.
● Three versions: Stable, Testing and Unstable.
● Debian Stable 9.0 (Stretch) was released on June 17th 2017.
● The Debian project does not provide official support.
● Uses the Debian package management system (.deb packages).


Ubuntu Linux

● Commercial version of Debian Linux maintained by the company Canonical.
● Based on the Debian distribution: “Debian made right”
● Stable Debian releases are very infrequent: Ubuntu provides two releases every year
● Ubuntu LTS releases, which are supported longer, are released every two years
  ○ The latest is Ubuntu 18.04 LTS.
● Ubuntu has a more user friendly installation and GUI environment.
● The set of default software is better integrated into a consistent desktop
  ○ The default desktop used to be Unity, but Unity development was stopped in 2017.
● Ubuntu 18.04 LTS will be used during the course.


SUSE

● German Linux distribution
● Uses the RPM package format
● At least used to be better at localization


OpenWrt

● OpenWrt is an open source project for an embedded operating system based on Linux, primarily used on embedded devices to route network traffic (Wikipedia).
● If you have a WiFi access point, it is likely that the access point firmware is based on OpenWrt.
● OpenWrt was one of the first successful Linux distributions sold to consumers.


Embedded Linuxes

● Linux runs on practically all 32-bit and 64-bit hardware implementing a memory management unit (MMU).
● The Android phone operating system is based on Linux.
● Nokia’s Maemo and MeeGo phone operating systems were based on Linux.
  ○ Jolla and Sailfish projects still continue.
● Samsung Tizen is an embedded Linux OS in Samsung phones, smartwatches and TVs
● Busybox is a software project to provide a minimal bootable Linux system.
● We don’t cover embedded Linux systems in this course.


Installing Linux

● A boot medium is needed
  ○ Typically a CD-ROM or DVD-ROM image (*.iso file)
  ○ USB boot images can be created from the .iso file
  ○ Alternatively a network boot image
  ○ Or a ready made virtual machine image
● Select installation disk
  ○ Partitioning and file system selection is typically an advanced task
● Create encryption passwords and user accounts
● Select locales
● Select software
● … done


Linux: filesystem layout


Linux Standard Base (LSB)

● LSB is a joint project by several Linux distributions under the organizational structure of the Linux Foundation to standardize the software system structure, including the filesystem hierarchy (Wikipedia).
● standard libraries
● a number of commands and utilities that extend the POSIX standard
● the layout of the file system hierarchy
  ○ Filesystem Hierarchy Standard (FHS)
● https://wiki.linuxfoundation.org/lsb/start


Filesystem Hierarchy Standard

● The Filesystem Hierarchy Standard (FHS) defines the directory structure and directory contents in Linux distributions. It is maintained by the Linux Foundation. The latest version is 3.0, released on 3 June 2015 (Wikipedia)
● Describes what should be under the filesystem root directory /
● FHS isn’t actually very useful for modern use:
  ○ Software is installed all over the file system depending on file types, instead of in just one directory
  ○ It makes it possible to use different (even read only) file systems for different directories to save space.
  ○ Or even to share files, but not binaries, between different CPU architectures


Executable binaries and libraries

● In the default PATH:
  ○ System executable binaries are installed in /bin and /usr/bin
  ○ Locally installed binaries in /usr/local/bin
  ○ Binaries only useful for root: /sbin, /usr/sbin
● In the default library path:
  ○ System libraries in /lib, /usr/lib and /lib64
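To see which of these directories are actually on the search path of a given system, the PATH variable can be split on its colons. A minimal sketch; list_path_dirs is a hypothetical helper name, and the output depends on the machine.

```shell
# Print each directory of a colon-separated list and whether it exists.
list_path_dirs() {
    # $1: colon-separated directory list (defaults to $PATH)
    old_ifs=$IFS
    IFS=:
    for dir in ${1:-$PATH}; do
        if [ -d "$dir" ]; then
            echo "$dir exists"
        else
            echo "$dir missing"
        fi
    done
    IFS=$old_ifs
}

list_path_dirs

# Show which file the shell would run for a command name.
command -v ls
```

Temporarily setting IFS to a colon lets the unquoted expansion do the splitting; the old value is restored before the function returns.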


System configuration

● System configuration should be in /etc
● Hostname: /etc/hostname
● Users and groups: /etc/passwd, /etc/shadow, /etc/group
● X Window System: /etc/X11


Changing data: /var

● Software package manager data
● System logs
● Default place for databases
● Libvirtd virtual machines
● Spools for printing and email


Userspace daemon status: /run

● A filesystem in RAM, created very early in the boot process
● Contains system state, which is not preserved across reboots
● Used to be /var/run


/boot, /media, /tmp, /var/tmp ...

● /boot contains files that are required during system start
● The files in /boot might need to be accessed by the bootloader, even before the Linux kernel has started.
● When the rest of the Linux system installation is encrypted, /boot still needs to be available
● /tmp and /var/tmp are for temporary files.
  ○ /var/tmp should be preserved across reboots
● /media is the place where removable file systems (USB, DVD) are mounted
● /mnt is for temporary mounts


Access to hardware: /dev

● The UNIX philosophy is “Everything is a file”
  ○ This actually isn’t true in Linux (or in other Unixes)
● Most, but not all, devices can be accessed by files under /dev
● Disk drives, physical memory, terminals, GPUs
  ○ Serial ports
● The command df lists the mounted filesystems and the block devices under them
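Device files can be read and written with ordinary tools, which is easiest to see with the harmless pseudo-devices /dev/zero and /dev/null. The sketch below assumes a Linux system with the usual coreutils; dev_type is a hypothetical helper name.

```shell
# /dev/zero yields an endless stream of zero bytes; read exactly 16 of them.
head -c 16 /dev/zero | od -An -tx1

# /dev/null discards everything written to it.
echo "discarded" > /dev/null

# Report whether a path is a character device, a block device or neither.
dev_type() {
    if [ -c "$1" ]; then
        echo "character device"
    elif [ -b "$1" ]; then
        echo "block device"
    else
        echo "not a device"
    fi
}

dev_type /dev/null
dev_type /

# List mounted filesystems together with the devices backing them.
df -h
```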


Kernel status: /proc

● Proc is a virtual file system generated by the Linux kernel on the fly
● It contains information about system state
● Tools like ps, top, lsof use this filesystem to gather information
● /proc/loadavg: system load average
● /proc/mounts: mounted filesystems
● /proc/vmstat: memory statistics
● /proc/[0-9]+: subdirectory for every process
● /proc/[0-9]+/fd/: entry for every open file handle of every process
● /proc/sys/: modifiable kernel settings
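Because /proc consists of plain text files, the information that ps and top display can also be read directly from a script. A small sketch, assuming a Linux system with /proc mounted; count_procs is a hypothetical helper name.

```shell
# The first three fields of /proc/loadavg are the 1, 5 and 15 minute loads.
read -r load1 load5 load15 _ < /proc/loadavg
echo "load: $load1 $load5 $load15"

# Every process has a numeric directory; $$ is the PID of this shell.
echo "open file descriptors of shell $$:"
ls -l "/proc/$$/fd"

# Count the numeric entries, i.e. the processes visible to us.
count_procs() {
    n=0
    for d in /proc/[0-9]*; do
        [ -d "$d" ] && n=$((n + 1))
    done
    echo "$n"
}
echo "processes: $(count_procs)"

# Kernel settings below /proc/sys are readable as files, too.
cat /proc/sys/kernel/hostname
```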


Kernel's view of the system: /sys

● Since kernel 2.6 there have been two different kernel state file systems
● The kernel maintains an internal tree of kernel modules, system drivers and subsystems
● This tree is available from the /sys directory
● It contains subdirectories for every device class
● It allows changing device settings from user space, like device power usage


The command line

Terminals, the shell and remote shells


Terminal emulators

● In the old days (1978), Digital Equipment Corporation developed and sold serial terminals, where the terminal was connected to a centrally managed mainframe either with a serial wire or a modem.
● Text position and text attributes were controlled with ANSI escape codes:
  ○ Clear screen: printf '\033[H\033[2J'
● The terminals were a huge commercial success and Linux terminal emulators are still compatible with them.
  ○ All terminals still have a tty device attached to them, which contains all the settings of a real serial port
● Gnome terminal, xterm and the Linux kernel console are terminal emulators
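The escape codes mentioned on this slide are ordinary byte sequences, so a script can produce them with printf. The sketch below wraps two common sequences in helper functions; the names bold and move_to are made up for this example.

```shell
# A control sequence has the form ESC [ parameters final-letter.
ESC=$(printf '\033')

# Wrap text in the "bold" (SGR 1) and "reset attributes" (SGR 0) sequences.
bold() { printf '%s[1m%s%s[0m' "$ESC" "$1" "$ESC"; }

# Move the cursor to a 1-based row and column.
move_to() { printf '%s[%d;%dH' "$ESC" "$1" "$2"; }

# Only emit escape codes when standard output really is a terminal.
if [ -t 1 ]; then
    printf '\033[H\033[2J'    # home the cursor and clear the screen
    move_to 3 10
    bold "hello from row 3, column 10"
    printf '\n'
    stty size                 # rows and columns as stored in the tty device
fi
```

Checking [ -t 1 ] first keeps the escape codes out of pipes and log files, where they would show up as garbage.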


The shell

● The shell is the program which is (typically) first started on a terminal emulator when the emulator starts.
● The /bin/sh programming syntax comes from a standard (POSIX)
● In the University’s Cubbli Linux the default unchangeable shell is GNU /bin/bash
  ○ Bourne Again shell
● There are plenty of other shells…
  ○ Ksh, Korn shell
  ○ Zsh
  ○ The unspeakable horror of csh and tcsh
  ○ Even Windows PowerShell


Kernel virtual console

● The virtual console is a terminal emulator implemented in the Linux kernel
  ○ The VT-100 emulator of the kernel written by Torvalds was the fastest and most reliable of its time...
● It is also available when no GUI environment is running
● Still relevant today when running servers
● It used to be implemented as a VGA text mode
  ○ There used to be multiple VGA text modes, which could be changed on the fly.
  ○ Now implemented with kernel built in frame buffer drivers for GPUs
● There are multiple virtual consoles, changed with a keyboard shortcut
  ○ CTRL+ALT+F1, CTRL+ALT+F2


SSH - Secure Shell

● When TCP/IP was introduced, remote shell access was an obvious application.
  ○ The Telnet and Rlogin protocols were created, but neither of them provides any encryption
● In 1995 Tatu Ylönen implemented an encrypted terminal session protocol, SSH.
  ○ Later the SSH protocol version 2 was developed as an RFC standard.
● SSH protocol encryption provides:
  ○ Server identity validation with private/public keys
  ○ Client identity validation with passwords and private/public keys
  ○ Kerberos authentication for clients and server identity validation
  ○ Strong encryption for sessions
  ○ Cryptographic signatures for packets to detect message tampering
● OpenSSH is the open source software that implements SSH protocol version 2
● Currently SSH is the most used and useful remote administration tool


SSH 2 protocol

● Allows multiple streams inside the same session (multiplexing)
  ○ Terminal sessions
  ○ Running commands without a terminal (remote piping)
  ○ TCP connection tunneling
  ○ Running remote windows using the X protocol
  ○ Simple tunneling using a SOCKS proxy or VPN network interfaces
    ■ sshuttle is a user space program that implements VPN-like tunneling of TCP connections over ssh
● Implements the SFTP remote file access protocol
  ○ Sftp from command line
  ○ Sshfs user space remote mounts
● Plenty of command line tools can use SSH pipes for remote access:
  ○ rsync, subversion, tar, git


SSH - transferring files

● Interactive sftp can be done from the command line. This is boring.
● Rsync is a tool to transfer and synchronize multiple files with one command over ssh (rsync needs to be installed on both sides of the connection)
  ○ rsync file.txt melkki.cs.helsinki.fi:
  ○ rsync --archive melkki.cs.helsinki.fi:directory .
● With sshfs you can (FUSE) mount remote directories
  ○ sshfs users.cs.helsinki.fi:/home/jjaakkol users
● Linux file managers understand sftp remote shares directly
  ○ sftp://users.cs.helsinki.fi/home/jjaakkol
● Pipes work over ssh with a few caveats:
  ○ ssh melkki.cs.helsinki.fi cat /dev/zero | pv > /dev/null
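A classic use of pipes over ssh is copying a directory tree with tar on both ends. The sketch below is an illustration only: REMOTE_HOST, run_remote and copy_tree are hypothetical names, and when REMOTE_HOST is unset the "remote" side simply runs in a local shell so the pipe can be tried without a server.

```shell
# Run a command string on REMOTE_HOST over ssh, or locally when it is unset.
run_remote() {
    if [ -n "${REMOTE_HOST:-}" ]; then
        ssh "$REMOTE_HOST" "$1"
    else
        sh -c "$1"
    fi
}

# tar writes an archive to stdout, the pipe carries it through the session,
# and a second tar unpacks it on the other side.
copy_tree() {
    # $1: local source directory, $2: existing destination directory
    # (paths containing single quotes are not handled by this sketch)
    tar -C "$1" -cf - . | run_remote "tar -C '$2' -xf -"
}

# Try it on two scratch directories.
src=$(mktemp -d)
dst=$(mktemp -d)
echo "hello" > "$src/file.txt"
copy_tree "$src" "$dst"
copied=$(cat "$dst/file.txt")
echo "copied content: $copied"
rm -rf "$src" "$dst"
```

For regular use rsync is usually the better tool; the tar pipe is shown because it makes the stdin and stdout forwarding of ssh visible.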


SCP should have died 20 years ago.

● Do not use scp and don’t let your friends use it.
● It belongs in the past:
  ○ Scp is a wrapper for the old rcp protocol from the dawn of the Internet.
  ○ The server selects which files it sends, not the client.
  ○ Stderr is redirected straight from the server.
● https://sintonen.fi/advisories/scp-client-multiple-vulnerabilities.txt


SSH public and private keys

● SSH authenticates servers with SSH keys and can authenticate clients too
● Keys are generated with ssh-keygen
● The private parts of the keys must be kept secret
  ○ Private keys should be encrypted on storage
  ○ Ssh-agent and Gnome-keyring can temporarily save keys in RAM
  ○ The ssh client then accesses the agent to use the private key and won’t need direct access to the key.
  ○ Ssh-agent key access can be forwarded over an ssh session.
● The trusted public keys are saved in .ssh/authorized_keys
  ○ The authorized_keys file syntax makes it possible to limit key access.
● A compromised ssh server can steal your password, but it never has access to your private key.
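Key generation and a restricted authorized_keys entry might look roughly like this. The sketch writes a throwaway key into a scratch directory and uses an empty passphrase only so that it runs unattended; the file name id_demo and the forced command uptime are arbitrary examples.

```shell
# Generate a throwaway Ed25519 key pair in a scratch directory.
# A real key should get a passphrase instead of the empty one (-N '').
keydir=$(mktemp -d)
if command -v ssh-keygen >/dev/null 2>&1; then
    ssh-keygen -q -t ed25519 -N '' -C 'demo key' -f "$keydir/id_demo"

    # The private half stays on the client; the .pub half goes to the server.
    ls -l "$keydir"
    keytype=$(cut -d' ' -f1 "$keydir/id_demo.pub")
    echo "public key type: $keytype"

    # Fingerprint of the public key, useful when comparing keys by hand.
    ssh-keygen -l -f "$keydir/id_demo.pub"

    # An authorized_keys line can carry options that limit what the key may do.
    printf 'command="uptime",no-port-forwarding %s\n' "$(cat "$keydir/id_demo.pub")"
fi
rm -rf "$keydir"
```

The printed line is what would be appended to .ssh/authorized_keys on the server to allow this key to run only the forced command.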


OpenSSH installation

● Apt/deb: apt install openssh-server
● Red Hat / RPM: dnf install openssh-server
● Server configuration and server keys are located in /etc/ssh


SSH client configuration

● The configuration file is .ssh/config
● Can have ssh server specific sections and ssh server aliases
  ○ Server specific user names
  ○ Server specific private keys
● Ssh jump host tunneling is enabled with the ProxyJump directive
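A client configuration using these features could look like the file written below. It goes to a scratch file rather than to .ssh/config so that nothing existing is overwritten; the alias names, the host internal.example.org, the user admin and the key path are assumptions made for illustration.

```shell
# Write a sample client configuration to a scratch file for inspection.
conf=$(mktemp)
cat > "$conf" <<'EOF'
# Short alias with a server specific user name and private key
Host melkki
    HostName melkki.cs.helsinki.fi
    User jjaakkol
    IdentityFile ~/.ssh/id_ed25519

# Reach an internal host by jumping through the alias above
Host internal
    HostName internal.example.org
    User admin
    ProxyJump melkki
EOF
cat "$conf"
```

With OpenSSH installed, running ssh -F with this file and -G internal prints the options the client would use for that alias without connecting anywhere.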


SSH common problems

● SSH server key management is a mess
  ○ SSH server keys can be signed by other keys
    ■ But which key?
  ○ SSH server identity can be verified with Kerberos
  ○ SSH server keys can be saved to DNS. If DNS could be trusted.
● SSH password brute forcing is endemic and constant on the Internet
  ○ You should always disable ssh password authentication on public network facing servers.
● Private keys must be protected from theft.
● Tunneling X clients over ssh sessions is dangerous
  ○ You should only do it from hosts which you trust.


SSH client software

● Windows
  ○ PuTTY
  ○ Tectia ssh (used to be ssh.com)
  ○ With the Linux subsystem, just openssh
● Mac: openssh
● Android:
  ○ Irssi connectbot
  ○ Juicessh (with mosh support)
● Browser: Chrome ssh plugin


Screen, tmux and mosh

● Screen and Tmux are terminal emulators that run inside terminal emulators.
● They enable running multiple sessions in one terminal.
● They make it possible to disconnect a session and attach it from another terminal.
  ○ Even when disconnected accidentally.
  ○ Or to attach multiple terminals to the same session
● When you run something which takes a long time on another host, you should always start a Screen or Tmux session.
● Mosh is a UDP (instead of TCP) based encrypted terminal session.
  ○ The connection is negotiated over plain SSH.
  ○ There is no latency caused by TCP retransmits or IP address changes.
  ○ Can immediately resume suspended sessions on laptops, tablets and phones.


User accounts

/etc/passwd, /etc/group, /etc/shadow… and the root account


User accounts

● A Linux (and Unix) user or system account needs:
  ○ A login name (mine is jjaakkol)
  ○ A primary UID: user identification number (mine is 1033431)
  ○ A primary GID: group identification number (group hyad-all, gid 300000)
● Also usually:
  ○ A gecos field: user’s full name (used to contain room number, phone number, …)
  ○ A home directory: /home/jjaakkol
  ○ A default shell: /bin/bash
  ○ Hashed and salted password
  ○ Account expiration date
  ○ Password validity time and timestamp of last password change
  ○ A list of additional groups the user belongs to (like sudo for sysadmins)


Files for user account information

● Public user account information is located in /etc/passwd
  ○ Login name, password field, uid, primary gid, gecos, home directory, shell

jjaakkol@melkki:~$ getent passwd jjaakkol
jjaakkol:x:1033431:3000000:Jaakkola Jani E:/home/jjaakkol:/bin/bash

● The group list and group members are located in /etc/group
● Normally every user gets their own primary group, named the same as the login name.

jjaakkol@loussa:~$ getent group sudo
sudo:x:27:adminjj,jjaakkol

● Private user account information is located in /etc/shadow
  ○ Salted and hashed password, account expiration time, password change time
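The seven colon-separated passwd fields can be pulled apart with the shell's read builtin. A small sketch; describe_account is a hypothetical helper, and the sample line is the getent output shown on this slide.

```shell
# Split one passwd-format line into its seven colon-separated fields.
describe_account() {
    # $1: a line in /etc/passwd format
    echo "$1" | {
        IFS=: read -r login pw uid gid gecos home shell
        echo "login=$login uid=$uid gid=$gid name=$gecos home=$home shell=$shell"
    }
}

describe_account 'jjaakkol:x:1033431:3000000:Jaakkola Jani E:/home/jjaakkol:/bin/bash'

# Look up the invoking user's own entry. Unlike reading /etc/passwd,
# getent also consults network sources such as LDAP.
if command -v getent >/dev/null 2>&1; then
    getent passwd "$(id -u)"
fi
```

The "x" in the password field means that the real hash lives in /etc/shadow, which ordinary users cannot read.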


Logging in

● Provided by some application implementing account validation and the login process
  ○ Graphical login implemented by a display manager: GDM, Lightdm
  ○ Network terminal login implemented by OpenSSH
  ○ The kernel console (and serial) login implemented by getty and /bin/login
● Changing the user account:
  ○ Switching to another user account with that user’s password: /bin/su
  ○ Switching to another user account with your own password: /bin/sudo
    ■ Admin access is implemented by sudo


The root account

● The system administration account is called root and it has uid 0
● The root account can do anything:
  ○ Access and modify any file, kill and debug any process, change state of any driver, access raw devices, ...
  ○ It is futile to try and restrict root’s access rights. There are always ways around restrictions.
  ○ Typically, there is nothing preventing the root account from doing something stupid.
● You should never use the root account for anything but administrative tasks!
  ○ Just accidentally running a normal user program with the root account can and will break things.
  ○ The Windows administrator account is different: there are all kinds of protections from accidentally breaking the system, and the system does not break if you run a browser with an administrator account
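One modest safeguard against running ordinary programs as root by accident is for a script to check its own user id before doing anything. A minimal sketch; is_root_uid and refuse_root are hypothetical helper names.

```shell
# Succeed only when the given numeric user id is root's (0).
is_root_uid() { [ "$1" -eq 0 ]; }

# Refuse to continue as root; meant for scripts that need no privileges.
refuse_root() {
    if is_root_uid "$(id -u)"; then
        echo "refusing to run as root" >&2
        return 1
    fi
}

if refuse_root; then
    echo "running with unprivileged uid $(id -u)"
fi
```

A script that does need privileges would invert the test and stop unless the uid is 0.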


Managing user accounts

● Adding user accounts: adduser
● Adding groups: addgroup
● Deleting user account: deluser
● Deleting group: delgroup
● Managing shadow file: chage
● Changing a password: passwd
● Changing a shell: chsh


Package Management

Red Hat Package Manager (RPM), Debian Package Management (deb packages)


(Software) package

● Components of a Linux system are distributed in packages.
  ○ A package can contain software, libraries, drivers, documentation, kernel, virtual machine …
    ■ Anything that could be useful when installed to a Linux system.
● A single package contains files, installation scripts and metadata
● Metadata includes:
  ○ Name of the package
  ○ Version number
  ○ Changelog
  ○ Dependencies of the package: the other packages that are required for this package to work
  ○ Description, installed size, maintainers, copyright, …
  ○ Cryptographic signature of the contents of the package
● Debian packages are in *.deb files, Red Hat packages in *.rpm files.


Package manager
● A Linux system needs hundreds (or thousands) of software packages to work
○ There is no standard Linux installation with predefined components.
○ As specified by FHS, packages install files all over the file system.
● Package manager:
○ Installs, removes, and upgrades packages
○ Maintains a database of installed files, packages and their metadata
○ Maintains and resolves trees of package dependencies:
■ Required libraries, required files, required features
■ Provided libraries, provided files, provided features
■ … and possible conflicts
○ Provides tools for dependency resolving
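The core of dependency resolving can be sketched as a topological sort: every package is installed only after the packages it requires. This is a simplified illustration, not how dpkg or rpm are implemented; real package managers also handle versions, provided features and conflicts. The package names below are hypothetical, and `graphlib` requires Python 3.9 or newer.

```python
from graphlib import TopologicalSorter  # standard library since Python 3.9


def install_order(dependencies):
    """Order packages so that each one comes after everything it requires.

    `dependencies` maps a package name to the set of packages it requires.
    A dependency cycle raises graphlib.CycleError.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical dependency data, for illustration only.
deps = {
    "firefox": {"libgtk", "libc"},
    "libgtk": {"libc"},
    "libc": set(),
}
order = install_order(deps)
print(order)
```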


Package repository
● A software repository, colloquially known as a "repo" for short, is a storage location from which software packages may be retrieved and installed on a computer (Wikipedia).
● Typically contains:
○ A list of installable packages and their metadata
○ Cryptographic signature of the metadata
○ The packages themselves (and their sources and build scripts, if available)
● Provided over http(s) from a web server
○ File shares and media (DVD or USB) are sometimes used.
● A distribution comes with a default list of repositories.
○ Repositories for updates, security updates, 3rd party software, non OS software, different computer architectures


Installing a package from a repository
1. Repository metadata is downloaded
2. Digital signature of the metadata is verified
3. The required package and its missing dependencies are downloaded
4. Digital signatures of the downloaded packages are verified
5. Pre-installation scripts are run
6. Files are installed to the system
7. Post-installation scripts and configuration are run
8. Database of installed packages is updated
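The verification in step 4 can be approximated with a checksum comparison. This sketch assumes that the already verified repository metadata carries a SHA-256 digest for each package file, as Debian-style repositories do; it deliberately leaves out the GPG signature check of the metadata itself. The function names are hypothetical.

```python
import hashlib
import hmac


def sha256_of(data: bytes) -> str:
    """Hex encoded SHA-256 digest of the downloaded package contents."""
    return hashlib.sha256(data).hexdigest()


def verify_package(data: bytes, expected_sha256: str) -> bool:
    """Compare the computed digest with the one listed in trusted metadata."""
    return hmac.compare_digest(sha256_of(data), expected_sha256.lower())
```

A package whose digest does not match the metadata must not be installed, because either the download is corrupt or the file has been tampered with.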


Security considerations
● Package configuration and installation are always done with system administrator’s (root) rights.
● There are no lists of mandatory permissions for the installed software.
○ Software run by a user can access all files and all devices the user can access.
○ … or when there are access restrictions, they are listed by the package, not by the system.
● There is nothing stopping a 3rd party repository from providing “updates” to system software.
○ If you install a 3rd party repository, it can install anything to your system.
● You should be very careful when using packages which aren’t provided by your distribution.
● You should always use package signature verification.


Runtime updates
● Linux (and Unix) systems allow removal and replacing of open files
○ The files are kept hidden in the file system until all processes holding them open are gone.
● In particular, a binary or a mapped library of a running process can be replaced
○ The running process will still have access to the old version of the library.
● Updates can be installed online, without a restart of the system.
● However, all processes using old libraries and binaries should eventually be restarted. A reboot is normally the simplest way to do this.
● Kernel cannot be upgraded without a reboot.
○ However, there is a method to fix kernel security holes while the kernel is running: Ksplice
○ In theory, a kernel module could be separately updated, unloaded and reloaded.
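The behaviour of removed open files can be demonstrated in a few lines. This is a minimal sketch assuming a Linux or other POSIX file system (it does not work on Windows); `read_after_unlink` is a hypothetical helper name.

```python
import os
import tempfile


def read_after_unlink(content: bytes) -> bytes:
    """Write a file, delete its name while it is open, then read it back."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "w+b") as handle:
        handle.write(content)
        handle.flush()
        os.unlink(path)                  # the file name is gone ...
        assert not os.path.exists(path)
        handle.seek(0)
        return handle.read()             # ... but the open file is still readable
```

The data is released only when the last open handle is closed, which is why a running process keeps seeing the old version of a replaced library.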


Debian package manager
● Installed package database is in /var/lib/dpkg
○ Text database
● Packages are named:
○ <package-name>_[<epoch>:]<upstream-version>-<debian.version>_<architecture>.deb
○ firefox_64.0-1_amd64.deb
● Packages are managed with /usr/bin/dpkg
● Dependencies are resolved and packages downloaded with /usr/bin/apt
● Packages have a configuration system called debconf
● Different tools search packages and their contents: apt-cache and apt-file
● Software repositories are configured in /etc/apt/
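The Debian naming scheme above can be taken apart with a small parser. This is a sketch that assumes well-formed file names with underscore-separated name, version and architecture, an optional epoch before a colon, and the Debian revision after the last hyphen; `parse_deb_filename` and the second example name are hypothetical.

```python
import re

DEB_FILENAME = re.compile(
    r"^(?P<name>[^_]+)_(?P<version>[^_]+)_(?P<arch>[^_]+)\.deb$"
)


def parse_deb_filename(filename):
    """Split a .deb file name into name, epoch, upstream, revision and arch."""
    match = DEB_FILENAME.match(filename)
    if match is None:
        raise ValueError(f"not a Debian package file name: {filename!r}")
    version = match.group("version")
    epoch, colon, rest = version.partition(":")
    if not colon:                      # no epoch present
        epoch, rest = None, version
    upstream, hyphen, revision = rest.rpartition("-")
    if not hyphen:                     # no Debian revision present
        upstream, revision = rest, None
    return {
        "name": match.group("name"),
        "epoch": epoch,
        "upstream": upstream,
        "revision": revision,
        "arch": match.group("arch"),
    }


print(parse_deb_filename("firefox_64.0-1_amd64.deb"))
```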


RedHat package manager
● Installed package database is in /var/lib/rpm
○ Binary database
● Packages are named:
○ <package-name>-<upstream-version>-<redhat.version>.<architecture>.rpm
○ firefox-64.0.2-1.fc29.x86_64.rpm
● Packages are managed with /usr/bin/rpm
● Dependencies are resolved and packages downloaded with /usr/bin/dnf
○ Used to be called /usr/bin/yum
○ Also does package searching and content listing.
● DNF global configuration file at /etc/dnf/dnf.conf
● Repositories under /etc/yum.repos.d
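The RPM naming scheme can be split from the right, because package names may themselves contain hyphens while version and release may not. This is a sketch assuming well-formed file names; `parse_rpm_filename` is a hypothetical helper, and the second file name in the usage line is made up for illustration.

```python
def parse_rpm_filename(filename):
    """Split <name>-<version>-<release>.<arch>.rpm into its four parts."""
    suffix = ".rpm"
    if not filename.endswith(suffix):
        raise ValueError(f"not an RPM file name: {filename!r}")
    stem = filename[: -len(suffix)]
    rest, arch = stem.rsplit(".", 1)               # architecture is the last dot field
    name, version, release = rest.rsplit("-", 2)   # name may contain hyphens
    return {"name": name, "version": version, "release": release, "arch": arch}


print(parse_rpm_filename("firefox-64.0.2-1.fc29.x86_64.rpm"))
print(parse_rpm_filename("python3-libs-3.7.2-4.fc29.x86_64.rpm"))
```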


Any differences between rpm and deb?
● Both can handle package installations, upgrades, dependency resolving, package building and signature verifying just fine
● Rpm is slightly faster, however both are very slow.
● Rpm does not have a configuration system like debconf
○ This is mostly a good thing for rpm
● With rpm you need one tool, dnf
● With debian: dpkg, dpkg-deb, dpkg-query, dpkg-buildpackage, apt, apt-get, apt-cache, apt-key, apt-file, aptitude …
○ This is by no means an exhaustive list.
● The lecturer would prefer rpm for simplicity, speed and lack of debconf.
○ The differences really are minor.


Debian apt cheat sheet
● Apt is the Advanced Package Tool of the Debian distribution.
○ Apt is a high level tool for package downloads and dependency resolution
● Update repository data: apt update
● Install package(s): apt install <packages>
● Upgrade packages: apt upgrade <packages>
● Remove package: apt remove <packages>
○ Remove also configuration data: apt purge <packages>
○ Remove unnecessary packages: apt autoremove
● Search packages: apt search
● List packages: apt list
● Show package metadata: apt show


Apt repository configuration
● List of system repositories:
○ /etc/apt/sources.list and /etc/apt/sources.list.d/
● List of trusted GPG keys for package signing:
○ /etc/apt/trusted.gpg and /etc/apt/trusted.gpg.d
● When managing repositories always make sure that the correct GPG public keys are available.
● Ubuntu PPA repositories are community maintained Personal Package Archives
○ A PPA repository can be added with a single command: add-apt-repository
○ Every PPA has its own package signing key
○ Oracle JDK 8 is available from a PPA
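Entries in /etc/apt/sources.list follow a one-line format: a type (deb or deb-src), optional options in square brackets, a URI, a suite and zero or more components. The slides do not show this format, so the parser below is an illustrative sketch based on that assumption; `parse_sources_line` is a hypothetical helper and the sample line is only an example.

```python
def parse_sources_line(line):
    """Parse one one-line-style sources.list entry; return None for blanks."""
    line = line.split("#", 1)[0].strip()      # drop comments
    if not line:
        return None
    fields = line.split()
    kind, rest = fields[0], fields[1:]
    if kind not in ("deb", "deb-src"):
        raise ValueError(f"unknown entry type: {kind!r}")
    options = {}
    if rest and rest[0].startswith("["):
        # Consume tokens up to and including the one that closes the bracket.
        while rest:
            token = rest.pop(0)
            text = token.strip("[]")
            if text:
                key, _, value = text.partition("=")
                options[key] = value
            if token.endswith("]"):
                break
    if len(rest) < 2:
        raise ValueError("entry needs at least a URI and a suite")
    uri, suite, *components = rest
    return {"type": kind, "options": options, "uri": uri,
            "suite": suite, "components": components}


entry = parse_sources_line(
    "deb [arch=amd64] http://deb.debian.org/debian buster main contrib  # example"
)
print(entry)
```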


Debian and package configuration: debconf
● Debian packages have a list of the package’s configuration files
○ Usually in the /etc directory
● If a configuration file has been changed, the changes do not get overwritten on package upgrade
● A system called debconf can generate configurations for packages and interactively ask questions when a package is installed.
○ It can be very annoying when a large install stops in the middle for stupid questions.
○ The answers to questions can be “preseeded”, but the only way to know the questions is to install the packages.
● Can be invoked with: dpkg-reconfigure <package>


Low level package management
● Low level package handling tool is /usr/bin/dpkg
● Does not do dependency resolving.
● Useful for inspection and installation of single downloaded packages
● Install package: dpkg --install <package file>
● Remove package: dpkg --remove <package>
● Purge package: dpkg --purge <package>
● List installed packages: dpkg --list
● List files of a given package: dpkg --listfiles <package>
● Verify installation of a package: dpkg --verify <package>


Debian package management internals
● The package database is maintained as text files in /var/lib/dpkg
● Packages can be downloaded with apt download <package>
● A package consists of files installed to the system and metadata.
○ The .deb package is really an ar archive and can be manually investigated
○ Inside the archive there are compressed tar archives (such as .tar.gz or .tar.xz) for installed files and for metadata
● Package sources can be downloaded with apt source
● Packages can be rebuilt with dpkg-buildpackage
● Package installation history and logs can be found in /var/log/dpkg.log and /var/log/apt/
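Because a .deb file is a plain ar archive, its member list can be read without any Debian tooling. The sketch below assumes the common ar layout: an 8-byte magic string followed by members, each with a 60-byte ASCII header (name in bytes 0-15, size in bytes 48-57) and data padded to an even length. `list_ar_members` is a hypothetical helper that takes the archive contents as bytes.

```python
AR_MAGIC = b"!<arch>\n"
HEADER_SIZE = 60


def list_ar_members(data: bytes):
    """Return (name, size) tuples for every member of an ar archive."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    members = []
    offset = len(AR_MAGIC)
    while offset + HEADER_SIZE <= len(data):
        header = data[offset:offset + HEADER_SIZE]
        if header[58:60] != b"`\n":
            raise ValueError("corrupt member header")
        # Names are space padded; some ar variants also append a slash.
        name = header[0:16].decode("ascii").rstrip().rstrip("/")
        size = int(header[48:58].decode("ascii").strip())
        members.append((name, size))
        offset += HEADER_SIZE + size + (size % 2)   # data is 2-byte aligned
    return members
```

For a typical .deb this lists three members: debian-binary, a control tar archive with the metadata, and a data tar archive with the installed files.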


Red Hat high level package management (dnf)
● DNF (Dandified Yum) is a high level tool for Red Hat package management
● Update repository data: dnf check-update

● Install package(s): dnf install <packages>

● Upgrade packages: dnf update <packages>

● Remove package: dnf remove <package>

● Search packages: dnf search

● List installed packages: dnf list installed

● List available packages: dnf list available

● Show package metadata: dnf info <package>

● List of repositories: /etc/yum.repos.d


RedHat low level package management
● Install a package: rpm -i <package file>

● Upgrade a package: rpm -U <package file>

● Remove a package: rpm -e <package>

● Verify package installation: rpm -V <package>

● List installed packages: rpm -qa

● List install scripts of a package: rpm -q --scripts <package>

● Find package containing a file: rpm -qf <file>

● Package database: /var/lib/rpm

● Package update history: rpm -qa --last


Snap packages
Snappy is a software deployment and package management system originally designed and built by Canonical for the Ubuntu phone operating system. The packages, called snaps, and the tool for using them, snapd, work across a range of Linux distributions allowing distro-agnostic upstream software packaging. The system is designed to work for internet of things, cloud and desktop computing. [Wikipedia]
● Self contained and not dependent on distribution.
● Distributed as a compressed file system image.


Flatpak
Flatpak (formerly xdg-app) is a software utility for software deployment, package management, and application virtualization for Linux desktop computers. It provides a sandbox environment in which users can run applications in isolation from the rest of the system. Applications using Flatpak need permission from the user to control hardware devices or access the user's files. [Wikipedia]


Linux Kernel


Linux Kernel
● Scheduler:
○ Processes and threads
● Memory manager:
○ Memory allocation, file system caching, swap
● Storage:
○ Block devices, physical and virtual
● Networking:
○ TCP/IP, WiFi, routing, bridging, firewalls
● Human interface devices:
○ Mice, keyboards
● Drivers and hardware access
○ Power management, virtualisation


Kernel features
● Linux is feature complete. All features of a modern OS are available.
○ Since version 2.6.
● The kernel is scalable
○ All important functions can run on multiple cores
○ Locks are “small”
○ NUMA (non-uniform memory access) is supported
● Multiple filesystems
○ Local and network filesystems.
○ Volume management on block level (LVM) and filesystem level (btrfs).
■ ZFS is available, but owned by Oracle.
○ User space file system support (FUSE)


More kernel features
● Networking stack is complete
○ New networking protocols are currently implemented and tested first on Linux kernels
○ IPv4 and IPv6 support
○ Supports multiple different VPNs, including IPsec
○ User space VPN support, with multiple protocols
● Power management
○ Kernel implements the ACPI power management standard for PCs
○ Suspending and hibernation are implemented in the kernel
■ Hibernation is likely to expose driver bugs and is therefore disabled in Ubuntu by default
○ CPU sleep states and frequency adjustment
○ Interrupt management and minimising the number of interrupts


Kernel and hardware
● Drivers
○ Linux is important enough that drivers exist for (almost) all modern hardware
○ Most drivers are implemented in the kernel
○ However, there are still proprietary drivers outside the kernel tree (the Nvidia GPU driver is one)
○ … and there still exist drivers which are so bad that you should just avoid them.
○ Buses: PCI, USB, SCSI, Thunderbolt
○ Network: Ethernet, WiFi, Bluetooth
○ HID: human interface devices
○ ALSA: sound devices
○ Block devices: SATA, SAS, hardware and software RAID, iSCSI, multipath
● Virtualisation
○ KVM enables a user space program to emulate a physical machine
○ Supports hardware passthrough to virtual machines


Kernel modules
● Linux kernel modules are loadable ELF binary libraries which provide features to the kernel.
○ Drivers, filesystems, complete subsystems (e.g. Bluetooth).
● Mostly they are loaded on demand by the system.
○ They have a dependency tree, where higher level modules need features provided by lower level modules.
● Driver modules provide metadata so that the correct module can be identified from the identifier of the hardware controller.
● Modules have load time arguments.
○ Sometimes you need to use them.


Module init tools
● Software package for managing and querying kernel modules.
● Loading a module and giving it arguments: modprobe
● Remove a module from kernel: rmmod
● List loaded modules: lsmod
● Show the metadata provided by module: modinfo
● Create the dependency tree: depmod
● Configuration files: /etc/modprobe.d/*.conf (formerly /etc/modprobe.conf)
○ List blacklisted modules: some hardware has multiple drivers.
■ The Linux kernel’s internal Nvidia GPU driver (nouveau) needs to be blacklisted when using Nvidia’s proprietary driver.
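The list printed by lsmod comes from /proc/modules, which can also be parsed directly. This sketch assumes the usual field order of that file (name, size, reference count, comma-terminated list of dependent modules or "-", state, address). `parse_proc_modules` is a hypothetical helper, and the sample text uses made-up sizes and addresses.

```python
def parse_proc_modules(text):
    """Parse /proc/modules style text into a dict keyed by module name."""
    modules = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue                      # skip blank or truncated lines
        name, size, refcount, used_by = fields[:4]
        modules[name] = {
            "size": int(size),
            "refcount": int(refcount),
            # "-" means no other module uses this one.
            "used_by": [] if used_by == "-" else [m for m in used_by.split(",") if m],
        }
    return modules


# Illustrative sample; on a real system read the text from /proc/modules.
SAMPLE = """\
usb_storage 77824 1 uas, Live 0xffffffffc0a1b000
uas 28672 0 - Live 0xffffffffc0a10000
"""
loaded = parse_proc_modules(SAMPLE)
print(loaded["usb_storage"]["used_by"])
```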


DKMS: Dynamic Kernel Module Support
● Linux kernel does not have any stable binary API for kernel modules.
● Therefore, proprietary kernel drivers need to be compiled and linked against every new kernel.
● DKMS is a system which automates this and is used for proprietary drivers.
○ The most important being the Nvidia kernel GPU driver.
● Let’s be clear: You don’t want to use DKMS.
○ Sometimes you must.


Torvalds’ opinion on the Nvidia GPU driver


Compiling your own kernel
● Download, extract, then try it…
● https://www.kernel.org/


Hardware and firmware

CPU, motherboard, firmware, PCI, SCSI, USB, ...


Motherboard
● Motherboard and motherboard chipset provide most of a desktop’s or laptop’s built-in hardware.
○ Nowadays the motherboard chipset is already built in to the CPU.
● May contain an integrated GPU (Graphics processing unit)
○ The GPU is usually already built in to the CPU too.
● Provides socket(s) for the CPU.
● Provides the bus to RAM and peripherals.
● The boot and runtime firmware (BIOS) is located on the motherboard.
○ System start and power management are managed by BIOS.
○ Also possible remote management features.
● Manages secure boot and system lock down features!


CPU
● Used to be the most important factor defining performance when choosing hardware.
● All modern hardware will have multiple CPU cores, up to hundreds.
○ Intel Core i9: 8 cores, 16 hyperthreads
○ Intel Xeon Platinum: 28 cores, 56 hyperthreads.
○ AMD Ryzen Threadripper: up to 32 cores
○ ARM server CPUs: 64 cores or more.
● Software needs to be massively parallel to get the most out of a CPU.
● For all CPU vendors the Linux platform is so important that the kernel support for CPUs is released first, the actual hardware later.
● AMD’s Ryzen 2 CPUs in year 2019 seem to be very competitive.
● CPU identification: cat /proc/cpuinfo
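The output of /proc/cpuinfo can be processed programmatically, for example to count logical CPUs (cores times hyperthreads). This sketch assumes the common layout where each logical CPU starts with a `processor : N` line; `count_logical_cpus` is a hypothetical helper and the sample text is shortened and made up.

```python
def count_logical_cpus(cpuinfo_text):
    """Count the 'processor' entries in /proc/cpuinfo style text."""
    count = 0
    for line in cpuinfo_text.splitlines():
        key = line.split(":", 1)[0].strip()
        if key == "processor":
            count += 1
    return count


# Shortened illustrative sample; on a real system read /proc/cpuinfo instead.
SAMPLE_CPUINFO = (
    "processor\t: 0\n"
    "model name\t: Example CPU\n"
    "\n"
    "processor\t: 1\n"
    "model name\t: Example CPU\n"
)
print(count_logical_cpus(SAMPLE_CPUINFO))
```

In everyday scripts `os.cpu_count()` from the Python standard library gives the same kind of number without parsing.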


RAM (Random Access Memory)
● Unlike CPUs, memory is sold by multiple vendors with standard interfaces.
● DDR2 SDRAM: Double data rate synchronous dynamic RAM interface.
○ 3.2 GB/s - 8.5 GB/s
○ The oldest memory you are likely to see still in use.
● DDR3 SDRAM: Double data rate type three SDRAM
○ 8.5 GB/s - 15 GB/s
● DDR4 SDRAM: Current, released in 2014
○ 17 GB/s - 25.6 GB/s
● DDR5 SDRAM: ready in 2020
● Column Access Strobe (CAS) latency, or CL, is the delay time between memory access and the moment the data is available from the memory module.
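The bandwidth figures above follow from simple arithmetic: transfers per second multiplied by the width of the data bus. This sketch assumes a standard 64-bit (8-byte) module data bus; the transfer rates used in the example (DDR2-400, DDR3-1066, DDR4-3200) are illustrative module grades not named on the slide.

```python
def peak_bandwidth_gb_s(mega_transfers_per_s, bus_width_bits=64):
    """Theoretical peak bandwidth of one memory channel in GB/s."""
    bytes_per_transfer = bus_width_bits // 8
    return mega_transfers_per_s * bytes_per_transfer / 1000


for label, rate in [("DDR2-400", 400), ("DDR3-1066", 1066), ("DDR4-3200", 3200)]:
    print(label, peak_bandwidth_gb_s(rate), "GB/s")
```

These are theoretical per-channel peaks; real throughput is lower, and systems with several memory channels multiply the figure.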


Firmware: BIOS, UEFI and ACPI
● When a PC is booted, the first software to start is the firmware.
○ Either old legacy BIOS (Basic Input Output System).
○ Or modern UEFI (Unified Extensible Firmware Interface)
● Firmware initializes hardware and starts the OS.
○ And needs to be updated, since it will have bugs.
● Can provide remote management features.
○ Intel Management Engine, various remote consoles (Dell DRAC, HP iLO).
● With TPM (Trusted Platform Module) can validate integrity of the booted OS.
○ And provide hardware secure storage for encryption keys.
● ACPI (Advanced Configuration and Power Interface)
○ Run time part of firmware, which handles power management with the OS.
● Firmware information: /usr/bin/dmidecode and /proc/acpi


Bus types and hardware lists
● Hardware and peripherals are connected to the CPU through buses.
○ Some of them are only internal to the CPU.
○ There is a hierarchy of buses from faster to slower, where the faster ones are located nearer (physically!) to the CPU.
○ Buses provide hardware enumeration (listing of devices connected to the bus).
● PCI and PCI-e are the native and fastest interfaces to the CPU.
○ NVM Express (NVMe): Non-Volatile Memory Host Controller Interface Specification is a specification for storage attached directly to PCI-e
● USB (Universal Serial Bus) is useful for external devices.
● SATA is a bus standard for internal hard drives.
● SCSI is an old bus protocol, which is still in use, even when actual SCSI hardware is not.


PCI and PCIe bus
● PCI bus is directly connected to the CPU (an Intel invention from the Pentium i586 era)
● Modern version is called PCIe (Peripheral Component Interconnect Express)
○ Replaces old PCI, PCI-X and AGP standards.
○ Multiple PCIe lanes to the CPU. Not a single shared bus anymore.
● Different interface ports based on the number of PCIe lanes
○ One lane has about 1 GB/s speed
○ USB adapters and Ethernet adapters manage with an x4 slot, GPUs use an x16 slot.
○ … so PCIe 3.0 (current) has speeds up to ~16 GB/s
○ 2019 should see PCIe 4.0 devices with speed up to 31.5 GB/s
○ Thunderbolt with USB-C compatible connector provides up to 5 GB/s speed to external devices.
■ Enough to run an external GPU and multiple monitors and USB devices.
○ NVMe is a direct interface to PCIe to provide storage access.
● List PCI devices: lspci, lspci -v
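The PCIe speeds quoted above can be derived from the per-lane signalling rate. This sketch assumes the commonly cited figures of 8 GT/s for PCIe 3.0 and 16 GT/s for PCIe 4.0, both with 128b/130b line encoding; the table and the function name are illustrative, not taken from the slides.

```python
# generation -> (giga-transfers per second per lane, encoding efficiency)
PCIE_GENERATIONS = {
    3: (8.0, 128 / 130),
    4: (16.0, 128 / 130),
}


def pcie_bandwidth_gb_s(generation, lanes):
    """Theoretical one-direction bandwidth of a PCIe link in GB/s."""
    gt_per_s, efficiency = PCIE_GENERATIONS[generation]
    per_lane_gb_s = gt_per_s * efficiency / 8   # 8 bits per byte
    return per_lane_gb_s * lanes


print(round(pcie_bandwidth_gb_s(3, 1), 3))    # one PCIe 3.0 lane
print(round(pcie_bandwidth_gb_s(3, 16), 2))   # x16 slot, PCIe 3.0
print(round(pcie_bandwidth_gb_s(4, 16), 2))   # x16 slot, PCIe 4.0
```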


USB
● Universal Serial Bus: standard for external devices.
● Standard cables and standard active repeaters called USB hubs.
● USB 2 max speed: 0.48 Gbit/s
● USB 3 max speed: 5 Gbit/s
○ Enough for 1 Gbit/s network, but not for 10 Gbit/s.
● USB 3 with type C interface: 10 Gbit/s
○ Can provide power for connected devices.
○ Also drive monitors with DisplayPort protocol.
● List USB devices: lsusb
○ Tree view listing USB adapters, ports, hubs and connected devices: lsusb -t


SCSI
● Protocol to access different hardware devices over SCSI bus.
○ Only the disk access protocol is still in use.
● Protocol has stayed the same, even when the hardware buses implementing it are gone.
● Used for usb-storage (SD cards), DVD-ROM drives, SATA drives
○ And this is why block devices are named /dev/sda, /dev/sdb
■ SCSI disk a, SCSI disk b
● iSCSI is a protocol to enable SCSI access over IP network.
○ Commonly used to provide storage access from a Storage Area Network.
● List SCSI devices: lsscsi


Complete hardware list
● Just use lshw


Booting a PC

There are ancient and brand new dragons here...


Starting a Linux PC
● PC firmware is started (BIOS or UEFI)
○ Devices are initialized and scanned: CPU, memory, devices, disks
○ Device controllers’ firmware initialization is run (network card, RAID controllers)
○ The boot device (disk or network device) is decided, either by non-volatile firmware configuration or manually.
● The firmware loads and starts a boot loader from the boot device.
○ GNU Grub2 or Windows boot manager (there are others however)
○ Boot loader loads its configuration from disk or network.
○ Boot loader loads kernel and initial ramdisk (initrd) to memory.
● Boot loader starts Linux Kernel.
○ Kernel starts user space software on initrd
○ Initrd loads drivers and locates the OS system installation
● Initrd starts the Linux distribution


Starting a PC - TL;DR version
1. PC is booted.
2. PC firmware is started.
3. Firmware locates boot device.
4. Firmware starts boot manager.
5. Boot manager loads kernel and initial ramdisk.
6. Boot manager starts kernel.
7. Kernel starts software on initrd.
8. Initrd loads drivers and starts the Linux operating system.


Legacy BIOS boot
● Legacy BIOS does not know how to use file systems
● It can read the MBR (Master Boot Record) partition table
● Boot loader is installed in a known location in the MBR.
○ The boot loader is started from MBR in 16-bit non protected mode, just like MS-DOS
○ This is the only place left where 16-bit non protected mode is used in 2019.
○ There can be only one MBR boot loader on a given disk.
● The MBR partition table can only address disks of up to 2 TiB (with 512-byte sectors).
● … Amazingly, this system actually works.
○ Virtual machines are still mostly booted this way, even when PCs are not.
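The MBR layout and its size limit can be illustrated with a small parser. This sketch assumes the classic layout: a 512-byte sector with boot code first, four 16-byte partition entries starting at byte 446, and the signature bytes 0x55 0xAA at the end. Each entry stores its start and length as 32-bit sector counts, which is where the 2 TiB limit comes from. `parse_mbr` is a hypothetical helper operating on bytes already read from a disk image.

```python
import struct

SECTOR_SIZE = 512
PARTITION_TABLE_OFFSET = 446
# 32-bit sector numbers with 512-byte sectors: 2**32 * 512 bytes = 2 TiB.
MAX_MBR_BYTES = 2**32 * SECTOR_SIZE


def parse_mbr(sector: bytes):
    """Return the non-empty partition entries of an MBR sector."""
    if len(sector) != SECTOR_SIZE or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    partitions = []
    for index in range(4):
        start = PARTITION_TABLE_OFFSET + 16 * index
        # status, 3 CHS bytes skipped, type, 3 CHS bytes skipped, LBA start, length
        status, ptype, lba_start, sectors = struct.unpack(
            "<B3xB3xII", sector[start:start + 16]
        )
        if ptype == 0:
            continue                     # type 0 marks an unused entry
        partitions.append({
            "bootable": status == 0x80,
            "type": ptype,
            "start_lba": lba_start,
            "sectors": sectors,
        })
    return partitions
```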


UEFI boot


UEFI Secure boot


Grub 2 Linux boot loader


Initial ramdisk - initrd


Network boot
