SYSTEM ADMINISTRATION
COMPUTER SCIENCE DEPARTMENT
COLORADO STATE UNIVERSITY
WAYNE TRZYNA
FALL 2012
CT 320: Network and System Administration
CT320: Fall Semester 2012
Topics
1. Organization
2. Hardware and Facilities
3. Operating Systems
4. Processes
5. Applications
6. Scripting
7. Networking & Security
8. Wrap-up
9/25/12
Organization
People: 21 faculty, 5 instructors, 13 staff members, 350 undergrads, 200 grad students
Systems: 600+ total, ~450 Linux, ~170 Windows, ~40 Macs, mainly HP workstations and servers
Other: ~17 shared printers, ~33 switches, wired and wireless networks, Linux lab, Windows lab
Staff: Wayne Trzyna (full time, ~35 years), Paul Hansen (half time, ~10 years), 3 half-time GSAs
Users: Password file currently has ~2500 passwords; hundreds of new accounts every semester
Sysadmin (1)
Sysadmin group is responsible for a large, diverse set of hardware and software systems!
Graduate assistants handle first-level support: new accounts, password problems, disk quotas, etc.
Graduate assistants train each other; they generally work in the organization for 1 to 3 years.
Network security group has another set of servers and systems, and their own sysadmin team.
Contact sysadmin via email ([email protected]) or by walking into upstairs office.
Sysadmin (2)
Responsibilities include:
  Procurement of hardware and software
  Installation of hardware and software
  Maintenance of hardware and software
  Management of periodic processes
  Providing support for users
  Working with facilities (new building!)
  Troubleshooting problems
  Software licensing
  Strategic planning
Server Room
Locked room with three high-capacity HVAC units (19 tons total cooling), a 50 kVA uninterruptible power supply (good for about 20 minutes), and a 10 Gb fiber optic network link to campus and the internet.
Facilities include rack-mounted and stand-alone production servers, along with extra servers for testing and development, tape loaders for backup, and the network ‘backbone’ switch.
Core services include file service, directory service, database (MySQL), print service (CUPS), web service (Apache), software license service, etc.
Servers run Red Hat rather than Fedora, for stability.
Compute servers include a 78-node single-processor 4-core system (lattice) and several 4-processor many-core large-memory (512 GB) machines.
Operating Systems
Fedora, Red Hat, W2K, WinXP, Win7, and Mac OS operating systems are supported.
Philosophy is to minimize versions of operating systems and apps. ‘Never ending battle of consolidation.’
OS upgrades often move or rename configuration files, change command parameters, etc.
Many problems caused by dependencies on specific versions of shared libraries, duplicate libraries, etc.
Multiple versions of libraries (and compilers) must often be maintained.
Processes
Cloning and installation of Linux and Windows operating systems.
Monitoring of disk quotas/space, expired accounts/access, file logging/pruning, security.
Automatic backup of user data which is stored on servers and mounted remotely to workstations.
Automatic distribution of passwords, groups, and other account information to workstations.
Primarily depends on cron to implement periodic processes.
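A disk-space monitor of the kind listed above could be run from cron with a small script along these lines. This is a minimal sketch, not the department's actual script: the function name `over_threshold` and the 90% limit are assumptions made for illustration. It reads `df -P` style output on standard input so that it can be tried on sample data.

```shell
#!/bin/sh
# Hypothetical sketch: report filesystems whose usage exceeds a limit.
# Reads POSIX-format `df -P` output on stdin; prints "mountpoint percent".
over_threshold() {
    limit=$1
    awk -v limit="$limit" '
        NR > 1 {                 # skip the df header line
            pct = $5
            sub(/%$/, "", pct)   # "93%" -> "93"
            if (pct + 0 > limit + 0) print $6, pct
        }'
}

# Typical use from a cron job (the 90% limit is an assumed value):
df -P | over_threshold 90
```

Mount points that contain spaces are not handled; a production script would also need to mail or log its findings.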
Processes: OS Updates
Signature process is the continual update of operating system and application software.
Follows major Fedora releases, approximately on a schedule of every six months.
Security requires kernel and other patches to be pushed to systems every week.
Basic strategy is to remote-mount user data from servers to avoid hosting storage on workstations.
This avoids the need to back up individual workstations and allows system software to be updated cleanly.
All Fedora clients run the same cloned system image.
Processes: Cloning (1)
1. Must initially produce a source system with the latest software, applications, drivers.
2. Knowledge is maintained with an evolving document that describes process (1000+ lines).
3. After creation/verification of the system, an image is copied with the dump command.
4. The target system has its disks formatted and partitioned (root, boot, swap, tmp).
5. The restore command is then used to copy the image to the target system.
6. All this is built based on scripted modifications to the Fedora rescue CD.
Processes: Cloning (2)
7. First, the target system is booted from a RAMDISK image on the customized rescue CD.
8. Then the disk is repartitioned, and filesystems and swap space are initialized.
9. Next, the system clone image is restored onto the root partition.
10. A customization script sets up IP addresses, host names, etc., and creates the boot block for GRUB.
11. Finally, the system is rebooted from the new disk image.
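The image-creation and image-restore steps in the list above can be pictured as a matched pair of pipelines. The sketch below only prints the commands rather than running them. The helper names, the device `/dev/sda2`, and the image path are hypothetical; the restore form mirrors the `gzip -d ... | restore rf -` pipeline in the department's clone script.

```shell
#!/bin/sh
# Hypothetical helper: print the pipeline that would write a compressed
# level-0 (full) dump of a source filesystem. Nothing is executed.
clone_image_cmd() {
    src_fs=$1      # filesystem or device to dump (assumed name)
    image=$2       # output image file (assumed path)
    printf 'dump -0 -f - %s | gzip > %s\n' "$src_fs" "$image"
}

# Hypothetical helper: print the matching restore pipeline, which is meant
# to be run from inside the freshly mounted target root filesystem.
restore_image_cmd() {
    image=$1
    printf 'gzip -d < %s | restore rf -\n' "$image"
}

clone_image_cmd /dev/sda2 /images/fedora.dump.gz
restore_image_cmd /images/fedora.dump.gz
```

Printing the commands first is a cheap way to review a destructive sequence before running it on real disks.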
Processes: Backup
Backup is primarily limited to server machines, since only servers store user data. We also back up one of each unique client clone image.
Servers (bach, parsons, chopin) each have around 5 TB of data in multiple partitions.
Backup is via LTO-4 tape drive with approximately 1.6 TB capacity per tape.
Backups use a homebrew system built on dump and restore. (Many organizations use open source or commercial packages: Amanda, Veritas, Tivoli, etc.)
Full backups are performed weekly, incremental backups daily. Fulls and incrementals are interleaved on daily tapes.
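A weekly-full, daily-incremental schedule maps naturally onto dump levels, and the figures above give a rough tape count. The sketch below assumes, purely for illustration, that the full dump runs on Sunday and that the other days use rising levels 1 to 6, so each daily dump holds the changes since the day before. The function names are hypothetical and the real homebrew system may be organized differently.

```shell
#!/bin/sh
# Hypothetical schedule. Argument: day of week as printed by `date +%w`
# (0 = Sunday). Prints the dump level to use on that day.
dump_level() {
    case $1 in
        0)     echo 0 ;;       # Sunday: full (level 0) dump
        [1-6]) echo "$1" ;;    # Monday..Saturday: incremental levels 1..6
        *)     return 1 ;;     # not a valid day number
    esac
}

# Tapes needed for a given amount of data, rounding up.
# Arguments: data size and tape capacity, both in GB.
tapes_needed() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

echo "dump level today: $(dump_level "$(date +%w)")"
echo "tapes for a 5 TB full dump: $(tapes_needed 5000 1600)"   # prints 4
```

At the stated capacity, a full dump of one 5 TB server spans four tapes, which is one reason to schedule fulls carefully.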
Processes: Miscellaneous
Printing: CUPS server and clients
Accounts: MOAA (Mother of All Accounts)
  Maintains an NDBM database of accounts: CSUID, full name, login, expiration, etc.
  Set of C programs developed internally
  Builds system data structures such as the passwd file, group file, etc.
X Windows: Problematic nVidia drivers
Web: Apache server on parsons
Wikis: MediaWiki and other solutions
Virtualization: Limited use within department (but many other organizations have embraced it)
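To make the idea of building the passwd file from an accounts database concrete, here is a minimal sketch. It is not MOAA, which is a set of C programs over NDBM. The colon-separated record format, the `/home/<login>` home directory, and the `/bin/bash` shell are all assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical record format, one account per line:
#   login:uid:gid:full name:expiration (YYYYMMDD)
# Prints passwd(5)-style lines for accounts that have not yet expired.
# Argument: today's date as YYYYMMDD.
build_passwd() {
    today=$1
    awk -F: -v today="$today" '
        $5 + 0 >= today + 0 {
            printf "%s:x:%s:%s:%s:/home/%s:/bin/bash\n", $1, $2, $3, $4, $1
        }'
}

# Sample data (invented): the second account expired before the given date.
build_passwd 20120925 <<'EOF'
alice:1001:100:Alice Example:20130601
bob:1002:100:Bob Example:20120101
EOF
```

Generating the file from one authoritative database is what allows the same account data to be distributed automatically to every workstation.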
Scripting
Automated processes are implemented as scripts, many of which are homegrown.
Scripts are written in csh, bash, or Perl, along with some C programs.
Limited use of Python in-house, but some third-party scripts use it.
Scripts are documented and must continually be updated as processes and operating systems evolve.
Scripting Example
echo_status "Mounting root filesystem"
status_busy
mkdir "${TGT_DIR}"
mount "${CLONE_DEVICE}${ROOT_PARTNO}" "${TGT_DIR}"
status_done

echo_status "Restoring from clone image"
status_busy
cd "${TGT_DIR}"
gzip -d < "${IMAGE_DIR}/${CLONE_IMAGE}" | restore rf -
status_done

echo_status "Cleaning & Configuring ${CLONE_HOST}"
status_busy
# leftovers from restore
rm -f "${TGT_DIR}/restoresymtable"
rm -f "${TGT_DIR}/.autofsck"
Crontab Example
20 * * * * /usr/lib/sendmail -Ac -q
#
# common server stuff starts here:
#
15 4 * * * find /tmp/ -ctime +7 -a -exec rm -f {} \; > /dev/null 2>&1
15 4 * * * find /var/tmp/ -ctime +7 -a -exec rm -f {} \; > /dev/null 2>&1
30 * * * * /s/bach/i/sys/sa/cron/check-xsession-errors
0,30 8-22 * * * /s/bach/i/sys/sa/cron/check-load 4.0
5,15,25,35,45,55 * * * * /s/bach/i/sys/sa/cron/uptime-log
#
# cronjobs specific to chopin start here:
#
#01 0 * * 6 (/usr/bin/find /s/chopin/ -name core -type f -depth -exec /bin/rm -f {} \; ) > /dev/null 2>&1
#01 0 * * 7 (/usr/bin/find /s/chopin/ -name 'core.[0-9]*[0-9]' -type f -depth -exec /bin/rm -f {} \; ) > /dev/null 2>&1
0 1 * * * (cd /usr/local/etc/dumps/bin; ./run.bach )
55 * * * * /s/bach/i/sys/sa/cron/check-dns
40 17 * * 1-5 /bin/csh -c "(cd /s/bach/i/sys/sa/hosts ; ./hosts.update > & ./logfile)"
0 8 * * * /usr/local/etc/dumps/bin/check_logs.pl --config=/usr/local/etc/dumps/lib/check_logs.cfg 1>/dev/null 2>/dev/null
0 0 1 * * /usr/local/etc/log-rotate…
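Each crontab entry above consists of five schedule fields (minute, hour, day of month, month, day of week) followed by the command to run. The helper below, whose name `cron_fields` is hypothetical, labels those fields for a single entry as a reading aid. It is a sketch and does not validate the schedule syntax.

```shell
#!/bin/sh
# Split one crontab entry into its five schedule fields and the command.
# `read` splits on whitespace without expanding "*" as a filename pattern,
# and the last variable receives the rest of the line.
cron_fields() (
    read -r minute hour dom month dow command <<EOF
$1
EOF
    printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\n' \
        "$minute" "$hour" "$dom" "$month" "$dow"
    printf 'command=%s\n' "$command"
)

cron_fields '0,30 8-22 * * * /s/bach/i/sys/sa/cron/check-load 4.0'
```

For the sample entry this shows that check-load runs at minutes 0 and 30 of every hour from 8 through 22, on every day.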
Applications
Must support diverse requirements for applications and tools.
Multiple versions of Firefox, GNU compilers, Java compilers, Eclipse, etc.
Operating heuristic is to require a Fedora RPM for each application, if possible.
Avoid third-party RPM packages; build directly from source instead.
Applications that are not shipped with Fedora reside in /usr/local.
Examples: android, apache, cuda, ffmpeg, hadoop, lapack, netbeans, python, R, scipy, tau, …
Networking & Security
Outside network connection is managed by ACNS, not the department.
Some level of protection provided by firewalls at ACNS border router and department router interfaces.
Must continually stay one step ahead of hackers by keeping patches up to date.
Homebrew “Tripwire” system monitors system files for unexpected changes.
Require users to select good passwords, enforced by the password command.
Linux is less susceptible to viruses; Windows systems are protected by Symantec.
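The idea behind a Tripwire-style monitor, as mentioned above, is to record a baseline of file checksums and later compare a fresh snapshot against it. The sketch below is not the department's system. The function names are hypothetical, and it uses the POSIX `cksum` utility only for portability; a real monitor would prefer a cryptographic hash.

```shell
#!/bin/sh
# Record a checksum line for every regular file under a directory.
# cksum is an assumption of this sketch, chosen because it is in POSIX.
snapshot() {
    find "$1" -type f -exec cksum {} + | sort
}

# Succeeds (exit status 0) when the two snapshot files differ.
tree_changed() {
    ! cmp -s "$1" "$2"
}

# Demonstration on a scratch directory.
dir=$(mktemp -d)
echo "original" > "$dir/passwd"
snapshot "$dir" > "$dir.base"          # baseline, stored outside the tree
echo "tampered" > "$dir/passwd"
snapshot "$dir" > "$dir.now"
if tree_changed "$dir.base" "$dir.now"; then
    echo "unexpected change under $dir"
fi
rm -rf "$dir" "$dir.base" "$dir.now"
```

The baseline has to be kept somewhere an intruder cannot rewrite it, otherwise the comparison proves nothing.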
Wrap-up
What are the essential skills required?
  Handle pressure well
  Good people skills
  Broad technical background
  Troubleshooting skills
  Organization and design skills
What is satisfying about the job?
  Immediate feedback
  Independence (If you keep things running well, people leave you alone.)
  Excitement!