Bacula: The Open Source Commercial Backup Solution
Keywords: Bacula, s3cmd, backup, incremental, differential, full, tape, commercial grade, autochanger, debian, ubuntu, disaster recovery plan, gpg, amazon s3, offsite storage
Install Bacula
- Install Bacula and its components
aptitude install bacula
aptitude install bacula-director
aptitude install bacula-sd
aptitude install bacula-fd
- [Recommended] You should also install the MySQL catalog backend for the Director and a MySQL admin tool:
aptitude install bacula-director-mysql
aptitude install mysql-admin
Configure Bacula
- Bacula is almost ready to run; you just need to modify a few things.
The best way to learn is to print the conf files below and read them alongside this manual.
- Director: bacula-dir.conf controls what will get run, when it will get run, and which clients you will be backing up.
- Storage Daemon: bacula-sd.conf controls which Director can talk to it and where, and on what device, it will store the files. Options: HDD/File, DVD, autochanger, DLT, DDS, DDS3, DDS4, OnStream, Exabyte, NAS
File Daemon: bacula-fd.conf needs to be installed on every client you want to back up. This program gets the files from the client system and sends them to the storage daemon. Its configuration controls which Director can use it.
All bacula configuration files are in /etc/bacula
/etc/bacula/
|-- bacula-dir.conf
|-- bacula-dir.conf.dist
|-- bacula-fd.conf
|-- bacula-sd.conf
|-- bconsole.conf
Network Binding
If you want to enable backup over the network, comment out the lines containing Address in the dir, sd, and fd conf files:
#DirAddress = 127.0.0.1
#SDAddress = 127.0.0.1
#FDAddress = 127.0.0.1
Director
- Director: bacula-dir.conf controls what will get run, when it will get run, and which clients you will be backing up.
Brief overview:
- The basic unit is a Job (one job = one client, one schedule, one storage, one pool). A minimal Job resource is sketched after this list.
- Name – Unique name
- Type – What to do: Backup, Migrate, Admin, Restore
- Level – Backup level type: Full, Differential, Incremental
- Client – Where to get the files (machine name). Name of the Client{}
- Storage – Where to put the files (which hardware). Name of the Storage{}: File, DDS-4, 8mmDrive, DVD, etc.
- Pool – Which set of Volumes (tapes, disk) to use. Name of the pool{}
- Schedule – When to do it. Name of the Schedule{}
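As a minimal sketch of how these pieces fit together (resource names such as server1-fd and WeeklyCycle are placeholders, not your actual configuration):

Job {
  Name = "server1"                                # unique name
  Type = Backup                                   # what to do
  Level = Incremental                             # backup level
  Client = server1-fd                             # which Client{} to back up
  FileSet = "Full Set"                            # which files
  Storage = File                                  # which Storage{} to write to
  Pool = Default                                  # which set of volumes
  Schedule = "WeeklyCycle"                        # when to run
  Messages = Standard
  Write Bootstrap = "/var/lib/bacula/server1.bsr"
}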
You will need to change the following:
- Password – You can change the passwords for dir, sd, fd, and console. Make sure they match across all files.
- Under the section FileSet { Name = "Full Set" ... change the files you want to back up, e.g. File = /home/myusername/ (see the sketch after this list).
- Under the section Job { Name = "RestoreFiles" ... change Where to point to where you want restored files to go. Default: Where = /tmp/bacula-restore
- Under the section Catalog { Name = MyCatalog; dbname = bacula; user = "bacula"; password = "secretpassword" ... make sure it has the correct information. If you don't have the database created yet, use the supplied script to install it:

mysql -u bacula -p < /usr/share/bacula-director/make_mysql_tables
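Here is a sketch of the FileSet change mentioned above (the paths are examples only):

FileSet {
  Name = "Full Set"
  Include {
    Options {
      signature = MD5
    }
    File = /home/myusername/
    File = /etc/
  }
  Exclude {
    File = /proc
    File = /tmp
  }
}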
Now start Bacula, see if it starts, and check the error log to make sure you didn't miss anything.

/etc/init.d/bacula-director start
or
/etc/init.d/bacula-director restart

cat /var/log/bacula/log

If you get an error in the log, fix it in the configuration.
Storage Daemon
- Storage Daemon: bacula-sd.conf controls which Director can talk to it and where, and on what device, it will store the files.
Make sure the following changes are done:
- All your passwords match the Director configuration.
- If you back up over the network, SDAddress = 127.0.0.1 is commented out.
Under the section Device { Name = FileStorage; Media Type = File; Archive Device ... change the Archive Device to where you will store your files. If, like me, you have a software RAID5 of four 500GB drives mounted at /home (about 1.3TB of usable space), you can set Archive Device = /home/bacula/backups
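The resulting Device resource might look like this (a sketch based on the stock Debian file-storage device; check your bacula-sd.conf for the exact defaults):

Device {
  Name = FileStorage
  Media Type = File
  Archive Device = /home/bacula/backups
  LabelMedia = yes                  # lets Bacula label unlabeled media
  Random Access = yes
  AutomaticMount = yes              # when device opened, read it
  RemovableMedia = no
  AlwaysOpen = no
}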
Here is also the place to define other devices. Look at the configuration file for examples of how to set up DDS-4, OnStream, DVD writer, Exabyte 8mm, etc.
- Make sure the bacula user has permission to write to the folder:

mkdir -p /home/bacula/backups
chown -R bacula:bacula /home/bacula
Start the Bacula storage daemon:

/etc/init.d/bacula-sd start
or
/etc/init.d/bacula-sd restart

cat /var/log/bacula/log

If you get a permission or login/password error in the log, fix it in the configuration file.
When you are done with the setup and everything works, make sure you read the Volume Management section. We need to set up multiple volumes for our backup so that rotation happens.
File Daemon
File Daemon: bacula-fd.conf needs to be installed on every client you want to back up; the bacula-fd program has to reside on each computer being backed up.
Make sure the following changes are done:
- If you back up over the network, FDAddress = 127.0.0.1 is commented out.
- Make sure passwords match the Director's conf file.
If you cannot resolve the hostname of the computer, add the appropriate computer names to the /etc/hosts file.
Start the Bacula file daemon:

/etc/init.d/bacula-fd start
or
/etc/init.d/bacula-fd restart

cat /var/log/bacula/log

If you get a permission or login/password error in the log, fix it in the configuration file.
Console
- There are a few consoles you can control Bacula with.
Here is a list:
- bconsole
- bacula-console-gnome
- bacula-console-wx
- bacula-console-qt
The conf files are located in /etc/bacula/. Make sure the console's password matches the one defined in the Director.
- Run bconsole to make sure it can connect
sudo bconsole
Connecting to Director servername:9101
1000 OK: servername-dir Version: 1.38.11 (28 June 2006)
Enter a period to cancel a command.
*exit
[Diagram: how the config files are connected]
Troubleshooting Connection
ip settings
- The first thing to do is to telnet to port 9102 and see if you connect or the connection gets rejected.
- Connect to the Director server, then to the client where you have the fd installed. If you don't get the following, then your IP settings are incorrect.
telnet server1 9102
Trying 192.168.1.68...
Connected to server1.local.
Escape character is '^]'.
quit
- Try each of these to find where your problem is. Telnet to the following:
telnet 127.0.0.1 9102      (localhost)
telnet 192.168.1.123 9102  (local ip address)
telnet servername1 9102    (local servername)
- Follow the same strategy to connect to your sd, and fd clients.
- If it only works on 127.0.0.1, you need to comment out the address lines:
#DirAddress = 127.0.0.1
#SDAddress = 127.0.0.1
#FDAddress = 127.0.0.1
- If the local IP address works, you need to add the servername to the hosts file or enable WINS support in /etc/nsswitch.conf:
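For the WINS option, the hosts line in /etc/nsswitch.conf might look like the following (a sketch; it requires the winbind package):

# /etc/nsswitch.conf
hosts: files dns wins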
ping servername1
- If you can't ping them, then you probably need to add their addresses to the hosts file:
vi /etc/hosts
Add these lines (replace the IP addresses with yours):

192.168.1.123 servername1
192.168.1.234 servernamefd
Manage bacula
- Start bacula
/etc/init.d/bacula-dir start
/etc/init.d/bacula-sd start
/etc/init.d/bacula-fd start
- Restart bacula
/etc/init.d/bacula-dir restart
/etc/init.d/bacula-sd restart
/etc/init.d/bacula-fd restart
Pools, volumes, labels
- I think the Bacula documentation explains it best:
- "If you have been using a program such as tar to backup your system, Pools, Volumes, and labeling may be a bit confusing at first. A Volume is a single physical tape (or possibly a single file) on which Bacula will write your backup data. Pools group together Volumes so that a backup is not restricted to the length of a single Volume (tape). Consequently, rather than explicitly naming Volumes in your Job, you specify a Pool, and Bacula will select the next appendable Volume from the Pool and request you to mount it."
"The steps for creating a Pool, adding Volumes to it, and writing software labels to the Volumes, may seem tedious at first, but in fact, they are quite simple to do, and they allow you to use multiple Volumes (rather than being limited to the size of a single tape). Pools also give you significant flexibility in your backup process. For example, you can have a "Daily" Pool of Volumes for Incremental backups and a "Weekly" Pool of Volumes for Full backups. By specifying the appropriate Pool in the daily and weekly backup Jobs, you thereby insure that no daily Job ever writes to a Volume in the Weekly Pool and vice versa, and Bacula will tell you what tape is needed and when." Seperate the pools
View this tutorial on how to get started with using bacula: Tutorial Chapter
- You also have the option to tell Bacula to create and label volumes for you. You can tell it how many volumes you want and what their maximum size should be. Check the Volume Management section for these settings.
bconsole
Start bconsole and type in help:
bconsole
help
  Command       Description
  =======       ===========
  add           add media to a pool
  autodisplay   autodisplay [on|off] -- console messages
  automount     automount [on|off] -- after label
  cancel        cancel [<jobid=nnn> | <job=name>] -- cancel a job
  create        create DB Pool from resource
  delete        delete [pool=<pool-name> | media volume=<volume-name>]
  disable       disable <job=name> -- disable a job
  enable        enable <job=name> -- enable a job
  estimate      performs FileSet estimate, listing gives full listing
  exit          exit = quit
  gui           gui [on|off] -- non-interactive gui mode
  help          print this command
  list          list [pools | jobs | jobtotals | media <pool=pool-name> | files <jobid=nn>]; from catalog
  label         label a tape
  llist         full or long list like list command
  messages      messages
  mount         mount <storage-name>
  prune         prune expired records from catalog
  purge         purge records from catalog
  python        python control commands
  quit          quit
  query         query catalog
  restore       restore files
  relabel       relabel a tape
  release       release <storage-name>
  reload        reload conf file
  run           run <job-name>
  status        status [storage | client]=<name>
  setdebug      sets debug level
  setip         sets new client address -- if authorized
  show          show (resource records) [jobs | pools | ... | all]
  sqlquery      use SQL to query catalog
  time          print current time
  trace         turn on/off trace to file
  unmount       unmount <storage-name>
  umount        umount <storage-name> for old-time Unix guys
  update        update Volume, Pool or slots
  use           use catalog xxx
  var           does variable expansion
  version       print Director version
  wait          wait until no jobs are running [<jobname=name> | <jobid=nnn> | <ujobid=complete_name>]

When at a prompt, entering a period cancels the command.
using bacula
- Run bconsole then type:
show filesets
show filesets
- You should see:
FileSet: name=Full Set
  O M
  N
  I /home/myusername/
  I /etc/
  N
  E /proc
  E /tmp
  E /.journal
  E /.fsck
  N
- I- Include
- E- Exclude
- O- Options
status dir
- Type in status dir
status dir
And you will see:
Level          Type     Pri  Scheduled          Name           Volume
===================================================================================
Incremental    Backup    10  01-Aug-08 23:05    server1        bacula20080801
Incremental    Backup    10  01-Aug-08 23:05    server2        bacula20080801
Full           Backup    11  01-Aug-08 23:10    BackupCatalog  bacula20080801
====
Running Jobs:
Console connected at 01-Aug-08 12:20
No Jobs running.
====
Terminated Jobs:
 JobId  Level   Files          Bytes  Status  Finished         Name
========================================================================
     1  Full   11,651  5,808,650,275  OK      01-Aug-08 11:01  server1
     3  Full   63,930  2,651,212,530  OK      01-Aug-08 12:14  server2
status client
Let's get a status on the client:
status client
Pick the client and you will see its status, jobs run, etc.
status client
The defined Client resources are:
     1: server1
     2: server2
Select Client (File daemon) resource (1-2): 2
Connecting to Client server2-fd at server2:9102
server2-fd Version: 1.38.11 (28 June 2006)  i486-pc-linux-gnu debian 4.0
Daemon started 01-Aug-08 12:01, 1 Job run since started.

Terminated Jobs:
 JobId  Level   Files          Bytes  Status  Finished         Name
======================================================================
     3  Full   63,930  2,651,212,530  OK      01-Aug-08 12:14  server2
====
Running Jobs:
Director connected at: 01-Aug-08 12:53
No Jobs running.
status storage
Let's find out the status of our storage:
status storage
You should see:
Automatically selected Storage: File
Connecting to Storage daemon File at server1:9103
server1-sd Version: 1.38.11 (28 June 2006)  x86_64-pc-linux-gnu debian 4.0
Daemon started 01-Aug-08 09:19, 3 Jobs run since started.

Running Jobs:
No Jobs running.
====
Jobs waiting to reserve a drive:
====
Terminated Jobs:
 JobId  Level   Files          Bytes  Status  Finished         Name
======================================================================
     1  Full   11,651  5,810,545,052  OK      01-Aug-08 11:01  server1
     2  Full   63,930  2,660,294,477  OK      01-Aug-08 12:14  server2
====
Device status:
Device "FileStorage" (/home/bacula/backups) is not open or does not exist.
====
In Use Volume status:
====
list jobs
- You should forward all emails from the server to your account, but if you want to see a list of jobs, type:
list jobs
- You should see:
| JobId | Name    | StartTime           | Type | Level | JobFiles | JobBytes      | JobStatus |
| 1     | server2 | 2008-08-01 10:57:43 | B    | F     | 11,651   | 5,808,650,275 | T         |
| 2     | server2 | 2008-08-01 11:42:43 | B    | F     | 0        | 0             | R         |
| 3     | server3 | 2008-08-01 12:05:10 | B    | F     | 63,930   | 2,651,212,530 | T         |
| 4     | server2 | 2008-08-01 13:05:39 | B    | I     | 12       | 125,049       | T         |
| 5     | server1 | 2008-08-01 13:53:08 | B    | F     | 5,900    | 127,019,988   | T         |
| 6     | server1 | 2008-08-01 13:53:39 | B    | F     | 5,900    | 127,019,988   | T         |
| 7     | server2 | 2008-08-01 23:05:04 | B    | I     | 17       | 128,228       | T         |
| 8     | server3 | 2008-08-01 23:05:14 | B    | I     | 888      | 49,182,024    | T         |
| 9     | server1 | 2008-08-01 23:05:45 | B    | I     | 2        | 15,433        | T         |
list volumes
- To list volumes run this command:
list volumes
- You should see:
Pool: Default
+---------+-------------+-----------+----------------+----------+--------------+---------+------+-----------+-----------+---------------------+
| MediaId | VolumeName  | VolStatus | VolBytes       | VolFiles | VolRetention | Recycle | Slot | InChanger | MediaType | LastWritten         |
+---------+-------------+-----------+----------------+----------+--------------+---------+------+-----------+-----------+---------------------+
|       3 | Volumes0001 | Full      | 26,843,504,883 |        6 |    2,678,400 |       1 |    0 |         0 | File      | 2008-12-08 19:49:05 |
|       4 | Volumes0002 | Full      | 26,843,504,948 |        6 |    2,678,400 |       1 |    0 |         0 | File      | 2008-12-09 03:25:59 |
|       5 | Volumes0003 | Append    | 24,754,608,236 |        5 |    2,678,400 |       1 |    0 |         0 | File      | 2008-12-09 14:18:28 |
+---------+-------------+-----------+----------------+----------+--------------+---------+------+-----------+-----------+---------------------+
run
To run a job type in:
run
Using default Catalog name=MyCatalog DB=bacula
A job name must be specified.
The defined Job resources are:
     1: server1
     2: BackupCatalog
     3: RestoreFiles
Select Job resource (1-3):
If you select 1 you will be asked:
Run Backup job
JobName:  Client1
FileSet:  Full Set
Level:    Incremental
Client:   rufus-fd
Storage:  File
Pool:     Default
When:     2003-04-28 14:18:57
OK to run? (yes/mod/no):
stop/delete/cancel jobs
If there is a need to stop a running job, you can do the following:
- log into bconsole
Issue list jobs to see which one is running (status R).
Issue cancel and tell it which JobId to cancel: cancel jobid=59. You can also use the job name: cancel job=myjobname
If you want a job to start again: run status dir to get a list of currently scheduled jobs, then for each job execute run yes job=$host-backup, as in the sketch below.
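A hypothetical sketch of that loop from a shell (job names are assumed to follow the $host-backup pattern; adjust to your own):

for host in server1 server2; do
  echo "run yes job=${host}-backup" | bconsole
done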
As a last resort, when you need to cancel jobs NOW because something is going wrong, run:
/etc/init.d/bacula-dir stop
/etc/init.d/bacula-dir start
or
/etc/init.d/bacula-dir restart
prune/purge/delete
- If you set up your volumes properly, automatic rotation should happen, provided the recycle options and retention values are set correctly.
- If for some reason you need to manually delete the volume here is what you do:
- List the volumes and find out if it is set to Full:
list volumes
If the VolStatus is Append or Recycle is set, the volume will be used.
- If Recycle Current Volume is set and the volume is marked Full or Used, Bacula will prune the volume (applying the retention period). If all Jobs are pruned from the volume, it will be recycled.
If the VolStatus is Append or Recycle, change it to Full or Used. Use the update command, pick Volume parameters, and then Volume Status:
update
- When done, issue the prune command, which obeys retention settings, or the purge command, which does not obey retention settings and deletes whatever you tell it to.
- Please be aware that purging or pruning doesn't delete actual data. It only deletes the catalog data and allows the volume to be recycled later.
- Also be aware that if you use purge and you only have one client/volume/pool, Bacula will automatically select that client/volume/pool, so be careful what you type.
Always be explicit and say exactly what you want to purge:

purge volume=myoldunusedvolume
Options are:
purge files jobid=<jobid>|job=<job-name>|client=<client-name>
purge jobs client=<client-name> (of all jobs)
purge volume|volume=<vol-name> (of all jobs)
- In my case I want to remove my old volume like this:
purge volume=myoldvolumename
- Then I am done with this volume and I want to delete it:
delete volume=myoldvolumename
- Then open a shell and delete the actual file:

cd /home/bacula/backups/
rm myoldvolumename
- If you already know you want to delete a volume, you can just issue the delete command; that removes all records for the volume from the database, including the file and job records, so purging is not necessary.
restore
- It's important that you do a few trial backups and restore a few files before you move on. It's crucial that you know what the process will look like.
restore all
To restore run
restore all
You should see:
First you select one or more JobIds that contain files
to be restored. You will be presented several methods
of specifying the JobIds. Then you will be allowed to
select which files from those JobIds are to be restored.

To select the JobIds, you have the following choices:
     1: List last 20 Jobs run
     2: List Jobs where a given File is saved
     3: Enter list of comma separated JobIds to select
     4: Enter SQL list command
     5: Select the most recent backup for a client
     6: Select backup for a client before a specified time
     7: Enter a list of files to restore
     8: Enter a list of files to restore before a specified time
     9: Find the JobIds of the most recent backup for a client
    10: Find the JobIds for a backup for a client before a specified time
    11: Enter a list of directories to restore for found JobIds
    12: Cancel
Select item: (1-12):
- Select 5 and select the client.
Then Bacula will show you the filesystem it has. You can browse it with ls and cd. When done browsing, type done and you will be asked if you want to run this job.
done
Bootstrap records written to /var/lib/bacula/server1-dir.1.restore.bsr
The job will require the following Volumes:

   bacula20080801

63930 files selected to be restored.

Run Restore job
JobName:    RestoreFiles
Bootstrap:  /var/lib/bacula/server1-dir.1.restore.bsr
Where:      /home/bacula/restore
Replace:    always
FileSet:    Full Set
Client:     server1-fd
Storage:    File
When:       2008-08-01 13:13:32
Catalog:    MyCatalog
Priority:   10
OK to run? (yes/mod/no):
- At this point you could tell it yes and it will restore the files to the default folder you specified in bacula-dir.conf.
restore select
- To restore selected files do:
restore select
- Select the client you want to restore from
- Navigate to the directory you want to restore:
- Mark files to restore
cd /home/lucas/
mark myimportantfolder
exit
You will be asked yes/mod/no. If you want to restore to a different client or a different folder, or to just overwrite the files on the client, type mod and change the parameters.
messages
- If you want to see messages run
messages
or
autodisplay on
restore file from x date
- The easiest way to restore from a certain backup is to:
list jobs
- Look at the date the backup was taken and pull that JobId:
restore select jobid=1242
- Mark the files you want to restore, and you are done.
To view the rest of the console commands see Bacula Console
Backups
Incremental vs Differential
- Bacula has 3 different backup types:

full         (complete dump)
differential (files changed since the last full backup)
incremental  (files changed since the last backup of any sort)
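As an illustration, the stock Debian bacula-dir.conf schedules these levels along these lines (a sketch; your default may differ by version):

Schedule {
  Name = "WeeklyCycle"
  Run = Full 1st sun at 23:05
  Run = Differential 2nd-5th sun at 23:05
  Run = Incremental mon-sat at 23:05
}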
VMware Images
If you are using VMware Server, you need to stop/suspend VMware and then make a backup of the snapshot. See VMware Bacula Backup.
- If you are using ESX, you are able to take a snapshot while the server is running.
Repair Mysql Tables
- After a big power outage that spanned 3 days and caused the computer to shut down at least 3 times at night, a MySQL table got corrupted because it wasn't closed properly. Since all the emails from my backup machine go to our admin group, I was notified about it right away.
- Log into mysql
mysql -u bacula -p
- Set the database
use bacula;
Check tables:
check table BaseFiles;
check table CDImages;
check table Client;
check table Counters;
check table Device;
check table File;
check table FileSet;
check table Filename;
check table Job;
check table JobMedia;
check table Media;
check table MediaType;
check table Path;
check table Pool;
check table Status;
check table Storage;
check table UnsavedFiles;
check table Version;
Repair Tables:
repair table BaseFiles;
repair table CDImages;
repair table Client;
repair table Counters;
repair table Device;
repair table File;
repair table FileSet;
repair table Filename;
repair table Job;
repair table JobMedia;
repair table Media;
repair table MediaType;
repair table Path;
repair table Pool;
repair table Status;
repair table Storage;
repair table UnsavedFiles;
repair table Version;
- [Optional] An alternative to fixing each table manually is to run one of the following commands, which will fix all MyISAM tables:
mysqlcheck --repair --all-databases -p
or
mysqlcheck -u bacula -p --auto-repair=1 bacula
Volume Management
Limiting the volume size
- By limiting the volume size and the number of volumes you can allow rotation of volumes and keep your backup usage at a constant level.
Breaking up the volumes helps with restoration when a catalog is not available (avoids rescanning a huge file).
Limit your volume size to about 10%-15% of the HD capacity, e.g. for a 1TB drive, a volume size of 100GB. Then set the maximum volumes on the pool to ((HD space / volume size) - 1), i.e. (1000GB / 100GB) - 1 = 9 volumes, so you don't need to worry about a full HD.
In the director, change the default pool settings. Below is an example of 20 volumes of 25GB each, totaling 500GB. When all volumes are filled, the oldest one gets recycled and overwritten. Modify this section to fit your size requirements and retention. Be aware that retention starts when a volume is full:
'Automatic recycling of Volumes is performed by Bacula only when it wants a new Volume and no appendable Volumes are available in the Pool. It will then search the Pool for any Volumes with the Recycle flag set and whose Volume Status is Full. At that point, the recycling occurs in two steps. The first is that the Catalog for a Volume must be purged of all Jobs and Files contained on that Volume, and the second step is the actual recycling of the Volume. The Volume will be purged if the VolumeRetention period has expired. When a Volume is marked as Purged, it means that no Catalog records reference that Volume, and the Volume can be recycled.'
- Add the last 3 settings to your Pool in a directors config file so it looks as follows:
Pool {
  Name = Default
  Pool Type = Backup
  Recycle = yes                  # Bacula can automatically recycle Volumes
  AutoPrune = yes                # Prune expired volumes
  Volume Retention = 31 days     # one month
  Accept Any Volume = yes        # write on any volume in the pool
  Maximum Volumes = 20
  Maximum Volume Bytes = 25G     # 25 GB
  Label Format = Volumes
}
- You also have options for naming your volumes; for example, with the above, the volumes will be named Volumes0001, Volumes0002, ...
- You can specify names like:
Label Format = "${Pool}_${Year}-${Month:p/2/0/r}-${Day:p/2/0/r}_${Hour:p/2/0/r}h${Minute:p/2/0/r}m"
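With the Default pool, this format would produce labels like Default_2008-12-08_19h49m (an illustrative value, not from this setup).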
Also make sure that in your bacula-sd config file the storage device (Name=FileStorage) has the following as one of its options. This will let Bacula label the volume on its own without user interaction:
LabelMedia = yes; # lets Bacula label unlabeled media
- After the first volume fills you should see something like this:
User defined maximum volume capacity 26,843,545,600 exceeded on device "FileStorage" (/home/bacula/backups)
End of medium on Volume "Volumes0001" Bytes=26,843,504,883 Blocks=416,105 at 08-Dec-2008 19:49.
Created new Volume "Volumes0002" in catalog.
Labeled new Volume "Volumes0002" on device "FileStorage" (/home/bacula/backups).
Wrote label to prelabeled Volume "Volumes0002" on device "FileStorage" (/home/bacula/backups)
New volume "Volumes0002" mounted on device "FileStorage" (/home/bacula/backups) at 08-Dec-2008 19:49.
Adding Clients
Linux
- On a Linux client you just need to install bacula-fd (File Daemon) and tell it where the Director is.
On your backup server you need to specify the Client, Job, and FileSet (if different from the default).
- Here is sample configuration for client #2:
#--------- Clients -------------
# Set up client to back up, part 1
Client {
  Name = server2-fd
  Address = server2
  FDPort = 9102
  Catalog = MyCatalog
  Password = "mypassword"          # password for FileDaemon 2
  File Retention = 30 days         # 30 days
  Job Retention = 6 months         # six months
  AutoPrune = yes                  # Prune expired Jobs/Files
}

# Second Job for client 2, part 2
Job {
  Name = "server2"
  Client = server2-fd
  FileSet = "Full Set server2"
  JobDefs = "DefaultJob"
  Write Bootstrap = "/var/lib/bacula/server2.bsr"
}

# FileSet for client 2, part 3
# List of files to be backed up
FileSet {
  Name = "Full Set server2"
  Include {
    Options {
      signature = MD5
    }
#
#  Put your list of files here, preceded by 'File =', one per line
#  or include an external list with:
#
#    File = /home/jgoerzen/work/bacula-1.38.11/debian/tmp-build-sqlite
    File = /usr/local/pythonenv
    File = /usr/local/turbogears
    File = /usr/local/src/
    File = /var/www/
  }
#
# If you back up the root directory, the following excluded
# files can be useful
#
  Exclude {
    File = /proc
    File = /tmp
    File = /.journal
    File = /.fsck
  }
}
Windows
Install the Windows Bacula version, which supports Microsoft Windows: Win98, WinMe, WinXP, WinNT, Win2000, and Win2003.
See bacula windows notes for compatibility and permission issues: Bacula for Windows
#
# Default Bacula File Daemon Configuration file
#
# For Bacula release 1.38.10 (08 June 2006) -- cygwin 1.5.18(0.132/4/2)
#
# There is not much to change here except perhaps the
# File daemon Name
#

#
# List Directors who are permitted to contact this File daemon
#
Director {
  Name = server1-dir
  Password = "mypassword"
}

#
# "Global" File daemon configuration specifications
#
FileDaemon {                          # this is me
  Name = windowsserver2-fd
  FDport = 9102                       # where we listen for the director
  WorkingDirectory = "c:/bacula/working"
  Pid Directory = "c:/bacula/working"
}

# Send all messages except skipped files back to Director
Messages {
  Name = Standard
  director = server1-dir = all, !skipped
}
Performance
Initial Setup
- With the default setup, after setting up 3 Linux servers and 3 Windows 2000 servers to be backed up, here are some statistics:
Days = ~20
Bacula Volume = 360GB
Biggest Full Backup:
  FD Bytes Written: 63,668,167,347 (63.66 GB)
  SD Bytes Written: 63,983,912,475 (63.98 GB)
  Rate: 771.2 KB/s
  Software Compression: 82.3 %
Others have expressed that they use bacula for: "46 clients, 12400 jobs, and 4.7 million files" and more.
It's been suggested that three settings can have an impact on performance: VSS, compression, and Maximum Network Buffer Size.
Increase Windows speed
- You have 2 options. You either lower the max buffer size in Bacula to 32K, which will slow down backup of Linux machines and possibly the speed of writing to DLT tape, or you increase the Windows network buffer size to 64K so it matches Bacula and the other OSes.
Option 1
To increase performance on some Windows machines you might need to set Maximum Network Buffer Size = <bytes>:
- "Please use care in setting this value since if it is too large, it will be trimmed by 512 bytes until the OS is happy, which may require a large number of system calls. The default value is 65,536 bytes. Note, on certain Windows machines, there are reports that the transfer rates are very slow and this seems to be related to the default 65,536 size. On systems where the transfer rates seem abnormally slow compared to other systems, you might try setting the Maximum Network Buffer Size to 32,768 in both the File daemon and in the Storage daemon. If a Windows machine is so slow as you describe I would try to set Maximum Network Buffer Size = 32768 in the fd-conf of this machine. After this you have to restart the bacula service."
Option 2
The primary TCP tuning parameters appear in the registry under
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters.

1. On the Edit menu, point to New, and then click DWORD Value.
2. Type GlobalMaxTcpWindowSize in the New Value box, and then press Enter.
3. Click Modify on the Edit menu.
4. Type the desired window size in the Value data box.

Note: the valid range for window size is 0-0x3FFFC000 hexadecimal.

System Key: [HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Tcpip\Parameters]
Value Name: GlobalMaxTcpWindowSize
Data Type:  REG_DWORD (DWORD Value)
Value Data: 0-0x3FFFFFFF

Set the window size to: 65536
You could also use DrTCP.
Hardware RAID5 vs Software RAID5
- In this documentation we are using Software RAID 5.
- Also don't forget to check your computer and motherboard to see if they support that many hard drives. If you are planning to use more than four software RAID hard drives, you need space for them in the PC case, and your motherboard needs more than four connectors.
- If you are thinking about hardware RAID, decide whether you will need SAS connectors (Serial Attached SCSI), since hardware cards with SAS can support both SAS and SATA connectors/drives, or SATA connectors, which support only SATA drives.
Disk speed
Disk Usage
iostat -d -x 5
Press ctrl + C to stop
ctrl + c
Pay special attention to await and %util.
Linux 2.6.26-2-amd64 (uicesv10)   02/18/2010   _x86_64_

Device:  rrqm/s  wrqm/s   r/s   w/s  rsec/s  wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
sda        0.02    7.79  0.29  1.90   18.62   83.85     46.91      0.01   4.79   0.80   0.17
sda1       0.00    0.00  0.00  0.00    0.01    0.00     63.53      0.00   2.84   2.34   0.00
sda2       0.02    7.79  0.29  1.90   18.61   83.85     46.91      0.01   4.79   0.80   0.17
dm-0       0.00    0.00  0.08  9.60    1.77   76.79      8.12      0.05   5.03   0.11   0.10
dm-1       0.00    0.00  0.01  0.04    0.11    0.32      8.00      0.00  14.75   0.19   0.00
dm-2       0.00    0.00  0.21  0.05   16.73    6.74     89.38      0.00   4.25   2.81   0.07
Here are the definitions
* rrqm/s : The number of read requests merged per second that were queued to the hard disk
* wrqm/s : The number of write requests merged per second that were queued to the hard disk
* r/s : The number of read requests per second
* w/s : The number of write requests per second
* rsec/s : The number of sectors read from the hard disk per second
* wsec/s : The number of sectors written to the hard disk per second
* avgrq-sz : The average size (in sectors) of the requests that were issued to the device
* avgqu-sz : The average queue length of the requests that were issued to the device
* await : The average time (in milliseconds) for I/O requests issued to the device to be served. This includes the time spent by the requests in queue and the time spent servicing them.
* svctm : The average service time (in milliseconds) for I/O requests that were issued to the device
* %util : Percentage of CPU time during which I/O requests were issued to the device (bandwidth utilization for the device). Device saturation occurs when this value is close to 100%.
First, note down the following values from the iostat output:
1. The average service time (svctm)
2. Percentage of CPU time during which I/O requests were issued (%util)
3. Whether a hard disk reports consistently high reads/writes (r/s and w/s)

If any one of these is high, take one of the following actions:
* Get a high-speed disk and controller for the file system (for example, move from SATA I to SAS 15k disks)
* Tune the software, application, kernel, or file system for better disk utilization
* Use a RAID array to spread the file system

If utilization is over 50%, it's time to start looking at how to distribute the workload.
- You can install the following package, which will let you read the performance of your hard drives:
aptitude update
aptitude install sysstat
- This package comes with a few useful programs: sar, sadf, mpstat, iostat, pidstat, and the sa tools.
- The one we will use is iostat. Run this command:
iostat -m 5
- The -m tells it to display output in megabytes; 5 tells it to refresh every 5 seconds.
- If you know which array you want to watch, run:
iostat -m 5 /dev/md5
- You can also use these commands to see I/O speeds on your drives:
iostat -x 1
- Look at avgqu-sz (average I/O queue length).
- A few more useful commands:
sar -P ALL 1 0
vmstat
- [Optional] There is also the hdparm program, which is more powerful and therefore more dangerous.
aptitude install hdparm
# Perform read timing. This will display read performance.
hdparm -t /dev/md5
Some statistics on the block size of the hard drive
- To test the speed of your hard drives, try the following:
- In this test you will write a big file to your RAID5 or RAID6 array at different stripe sizes. The stripe size that fits your HDD will give you the highest speed. The flip side is that if you don't get the right number, your iowait time will increase and the system will wait longer for the hard drive to act. If the number is too low, then you are not utilizing your I/O speed. Change your stripe size, watch iostat -k 5, and monitor your iowait time.
- This writes from /dev/zero to a file called /home/lucas/bigfile; try any of these commands and see what the write speed is.
dd if=/dev/zero of=/home/lucas/bigfile bs=64k count=8192
dd if=/dev/zero of=/home/lucas/bigfile bs=128k count=8192
dd if=/dev/zero of=/home/lucas/bigfile bs=256k count=8192
dd if=/dev/zero of=/home/lucas/bigfile bs=512k count=8192
dd if=/dev/zero of=/home/lucas/bigfile bs=1024k count=8192
dd if=/dev/zero of=/home/lucas/bigfile bs=2048k count=8192
dd if=/dev/zero of=/home/lucas/bigfile bs=4096k count=8192
- If you want to experiment with block size, find out your current default size (usually 64K).
- Its been suggested "that bigger stripe sizes (than the default 64k) give you worse write performance and higher iowait but better read performance on large files."
cat /proc/mdstat
cat /sys/block/md0/md/stripe_cache_size
echo 1024 > /sys/block/md1/md/stripe_cache_size
Example
- Here is an example that shows what you need to test. You can perform this test on a live server; it does not cause any problems.
#cpu information
cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 65
model name      : Dual-Core AMD Opteron(tm) Processor 2214
stepping        : 3
cpu MHz         : 2200.009
cache size      : 1024 KB

#memory size
free
             total       used       free     shared    buffers     cached
Mem:       8184404    4068088    4116316          0      20448    3156488
-/+ buffers/cache:      891152    7293252
Swap:     18321400         84   18321316

#Active stripe size
cat /sys/block/md5/md/stripe_cache_active
387

#Stripe cache size in memory held by the OS. As you can see it holds 512 by default.
cat /sys/block/md5/md/stripe_cache_size
512

#RAID setup description. As you can see, md5 is my RAID5.
cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md5 : active raid5 sda2[0] sdd2[3] sdc2[2] sdb2[1]
      1462211904 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
md0 : active raid1 sda1[0] sdd1[3] sdc1[2] sdb1[1]
      979840 blocks [4/4] [UUUU]

#Write a big file to the /home/ folder, which is on my RAID5 array
dd if=/dev/zero of=/home/lucas/bigfile bs=1M count=8192
8192+0 records in
8192+0 records out
8589934592 bytes (8.6 GB) copied, 265.931 seconds, 32.3 MB/s

#Delete the file.
rm /home/lucas/bigfile

#Change the cache size
echo 1024 > /sys/block/md5/md/stripe_cache_size
#Run the command again:
dd if=/dev/zero of=/home/lucas/bigfile bs=1M count=8192
#A little more performance than with the original settings.
8192+0 records in
8192+0 records out
8589934592 bytes (8.6 GB) copied, 160.36 seconds, 53.6 MB/s

#Delete the file.
rm /home/lucas/bigfile

#Change the cache size
echo 2048 > /sys/block/md5/md/stripe_cache_size
dd if=/dev/zero of=/home/lucas/bigfile bs=1M count=8192
8192+0 records in
8192+0 records out
8589934592 bytes (8.6 GB) copied, 81.7626 seconds, 105 MB/s

#Delete the file.
rm /home/lucas/bigfile

#Change the cache size
echo 4096 > /sys/block/md5/md/stripe_cache_size
#Test now.
dd if=/dev/zero of=/home/lucas/bigfile bs=1M count=8192
8192+0 records in
8192+0 records out
8589934592 bytes (8.6 GB) copied, 82.7625 seconds, 104 MB/s

#Delete the file.
rm /home/lucas/bigfile

#Change the cache size
echo 8192 > /sys/block/md5/md/stripe_cache_size
#Test now.
dd if=/dev/zero of=/home/lucas/bigfile bs=1M count=8192
8192+0 records in
8192+0 records out
8589934592 bytes (8.6 GB) copied, 64.9024 seconds, 132 MB/s
Backing up Databases
Mysql
- With this simple script you can back up MySQL to a .gz file each day. The script keeps the last 14 days of backups for quick access. What is left to do is add this script to cron to run daily, and set up Bacula to back up this folder with your regular settings. That way Bacula has a copy of the backup files, and you have quick access in case you need the file.
#!/bin/sh
#Author: Lukasz Szybalski
#License: LGPL
#Feedback: szybalski@gmail.com
#This program creates a mysql backup using mysqldump. It gzips the file via a pipe.
#It creates a backup file for each day it runs. There is a delete line that will
#delete the backup from today - 14 days. This script is intended to run daily via
#cron. When done, run crontab -e and set the cron line to something like the
#following (run at 2am every day; change the folder location, etc.):
#0 2 * * * /home/trac/backup/mysql_backup.sh

# modify the following to suit your environment
export DB_BACKUP="/home/trac/backup/mysql_backup"
export DB_USER="root"
export DB_PASSWD="somepassword"

if [ -d $DB_BACKUP ]; then
    echo "Folder exists, proceeding.."
else
    mkdir -p $DB_BACKUP
    echo "Creating Folder '$DB_BACKUP'"
fi

# title and version
echo ""
echo "MySQL_backup on `hostname`"
echo "----------------------"

#Delete the file that is 14 days old.
rm -f $DB_BACKUP/mysql-`hostname`-`date --date='14 days ago' '+%Y%m%d'`.gz

#Create a new backup
echo "* Creating new backup..."
mysqldump --user=$DB_USER --password=$DB_PASSWD --all-databases | gzip > $DB_BACKUP/mysql-`hostname`-`date +%Y%m%d`.gz
#mysqldump --user=$DB_USER --password=$DB_PASSWD --host=$DB_HOST --databases trac | gzip > $DB_BACKUP/mysql-`hostname`-trac-`date +%Y%m%d`.gz

echo "----------------------"
echo "Done"
exit 0

#RESTORE Procedure (You should always write a restore procedure. Test it too.)
# mysql -u root -p
#create user 'trac'@'%' IDENTIFIED BY 'somepass';
#create database trac;
#GRANT ALL PRIVILEGES ON trac.* TO 'trac'@'%' WITH GRANT OPTION;
#FLUSH PRIVILEGES;
#Restore on the command line:
#mysql -u trac -p database_name_to_restore_to <mysqldump_file_to_restore-20100215
#mysql -u trac -p claimtrac <mysql-trac-20100215
Offsite Storage
Amazon S3
Here is a script that reads the Bacula MySQL database and uploads the files to Amazon S3 for backup: Bacula with Amazon S3 backup
- You might need to divide the Bacula volume into 5GB files.
- Add Maximum Block Size to your device configuration.
Maximum Block Size = nnn
or
Maximum Block Size = 5368709120   # 5,368,709,120
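Note that capping the size of each volume file is normally done with Maximum Volume Bytes in the Pool resource instead (see the Volume Management section); a hedged sketch:

Pool {
  Name = Default
  Pool Type = Backup
  Maximum Volume Bytes = 5G   # keep each volume file under the 5GB S3 object limit
}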
Amazon S3cmd
- There is a command-line tool that allows you to browse, copy, and delete data on your Amazon S3 account.
aptitude update
aptitude install s3cmd
- [Optional] Or download and install it from source if it's not available:
#Check the website for the most current version
wget http://superb-west.dl.sourceforge.net/sourceforge/s3tools/s3cmd-0.9.9.tar.gz
tar -xzvf s3cmd-0.9.9.tar.gz
cd s3cmd-0.9.9
python setup.py install
- If you get the following error:
No module named etree.ElementTree
No module named elementtree.ElementTree
Please install ElementTree module from
http://effbot.org/zone/element-index.htm
- Do
aptitude install python-elementtree
- Now try to install it again
python setup.py install
Configure
- Now we need to configure it
s3cmd --configure
- Provide your access and secret keys; you can use the default answers for the rest if you choose to.
Access key:
Secret key:
Password for encryption
Answer whether you want to use https
Test if you can connect
Using s3cmd
List buckets. Note that a single uploaded file can be at most 5GB (hence the file splitting below):
s3cmd ls
Make a bucket
s3cmd mb s3://someuniquename.my-new-bucket-name
List the contents of the bucket
s3cmd ls s3://someuniquename.my-new-bucket-name
#Example
s3cmd ls s3://logix.cz-test
Bucket 'logix.cz-test':
.....
Upload a file into the bucket
s3cmd put addressbook.xml s3://logix.cz-test/addrbook.xml
#or
s3cmd put file-* s3://logix.cz-test/
- When you upload files they can be private or public. Public files can be accessed via the Amazon HTTP service.
- Backups should be private, so we would use this command:
s3cmd put --acl-private --guess-mime-type storage.jpg s3://logix.cz-test/storage.jpg
If you uploaded it as public you could see it in a browser: public uploaded file
Retrieve the file back and verify that it hasn't been corrupted:
s3cmd get s3://logix.cz-test/addrbook.xml addressbook-2.xml
#Verify the md5sums; you could potentially upload the md5sums to Amazon and verify them during restore.
md5sum addressbook.xml addressbook-2.xml
39bcb6992e461b269b95b3bda303addf  addressbook.xml
39bcb6992e461b269b95b3bda303addf  addressbook-2.xml
Remove the file and the bucket. You can only remove empty buckets:
s3cmd del s3://logix.cz-test/addrbook.xml s3://logix.cz-test/storage.jpg
Object s3://logix.cz-test/addrbook.xml deleted
Object s3://logix.cz-test/storage.jpg deleted
s3cmd rb s3://logix.cz-test
Bucket 'logix.cz-test' removed
If you have big monthly archives and need to split them into 5GB sections, you can specify a few folders to upload into. If you don't want to specify --acl-private each time, you can edit ~/.s3cfg (vi ~/.s3cfg) and set acl_public = False.
s3cmd put --acl-private data_archives_20080[1,2,3,4].zip.gpg s3://logix.cz-test/data_archives_2008_part1/
s3cmd put --acl-private data_archives_20080[5,6,7,8].zip.gpg s3://logix.cz-test/data_archives_2008_part2/
s3cmd put --acl-private data_archives_2008[09,10,11,12].zip.gpg s3://logix.cz-test/data_archives_2008_part3/
Split Files into 5GB
- To split files, do:
split -b 5120m filename newfilename
- Example
split -b 5120m S_drive_shared.tar.gz.gpg S_drive_shared.tar.gz.gpg
#will create:
#S_drive_shared.tar.gz.gpgaa
#S_drive_shared.tar.gz.gpgab
#S_drive_shared.tar.gz.gpgac
- To combine them do:
cat S_drive_shared.tar.gz.gpg* >>S_drive_shared.tar.gz.gpg
- Don't forget to upload the md5sum log to Amazon S3. After recovery, check that the md5sums match:
for i in *;do md5sum $i >>md5sum.log;done
List all files on amazon S3
- Sometimes you need to print a listing of all files in your Amazon S3. To do that, issue this command:
for i in `s3cmd ls |sed -e 's/^.\{18\}//'`;do s3cmd ls $i;done >>amazon_s3_20090515.txt
rsync and Encryption (gpg)
As far as encryption goes, there are a few things you can do. Parts of this can be used with Bacula, others for your archives which are no longer changing.
Big data files that don't change: encrypt the data and send it over to offsite storage. Read GNUPG for Backups to get you started. For an introduction to gpg try these: Intro and batch script, GNUPG and GNUPGP Encryption.
To encrypt often-changing files and sync them with offsite storage you can use encfs, configure it in expert configuration mode, and then use s3cmd sync on the encrypted folder. With encfs you have the option to randomize filenames if you choose to. A sketch of this workflow follows.
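A minimal sketch of that encfs + s3cmd workflow (the paths and bucket name are placeholders):

aptitude install encfs
# Create/mount: ciphertext lives in .encrypted, the plaintext view in plain
encfs /home/backup/.encrypted /home/backup/plain   # answer 'x' for expert mode
# Work with files under /home/backup/plain, then sync the ciphertext offsite
s3cmd sync /home/backup/.encrypted/ s3://mybucket/encrypted-backups/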
Here is a quick guide to gnupg
- I will use gnupg to encrypt our huge archive document files from the past 5 years. These archives don't change, so I will encrypt them once and send them to Amazon S3 (offsite storage). We will need these if we lose the originals.
- Install gnupg
aptitude update
aptitude install gnupg
- Generate a key
gpg --gen-key
- List keys
gpg --list-keys
gpg --list-secret-keys
- Encrypt the file (NAME is your name as you specified in key):
gpg -r NAME --output OUTFILE.gpg --encrypt INFILE.zip
or
gpg -r NAME --encrypt INFILE.zip
- The second command will append .gpg at the end.
- To decrypt, you need to provide the output file:
gpg -r NAME --output OUTFILE --decrypt INFILE.gpg
- Export the public and secret keys. The secret key should only be known by you or your CEO.
#To export a public key to an ascii text file, run:
gpg -a --export NAME > yourpublickey.gpg
#To export a private (or secret) key to an ascii text file, run:
gpg -a --export-secret-keys NAME > yourprivatekey.gpg
- Import the secret key to a restore server if you have one. If not, try on a different machine and see if you can decrypt one of your backup files:
gpg --import yoursecretkey.gpg
#Change the trust level of this key to (5) ultimate
gpg --edit-key NAME
trust
#(choose 5)
- Check the batch link above to create a batch script to encrypt the whole backup folder. (I use it to encrypt archives from the past 5 years that do not change.)
- When uploading to a remote location, don't forget to take an md5sum of both the unencrypted and encrypted file, so that you can verify you are getting the same file back:
md5sum myarchivefile.zip
md5sum myarchivefile.zip.gpg
Place your private key somewhere only you can access it, outside of the office, in case the place where you keep your servers burns down.
- Batch decrypt on the fly (after read password you type in the passphrase for your secret key):
aptitude install pyp
read password
for i in *.gpg;do echo "${password}"| gpg --passphrase-fd 0 -r NAME --output `ls $i|pyp 'p[:-4]'` --decrypt $i;done
- I use pyp to rename somefile.zip.gpg to somefile.zip; p[:-4] means everything up to the last 4 characters (.gpg).
#Other python pyp examples
for i in data-2010-*.jpg;do mv -v "$i" `echo $i|pyp "p.replace(' ','').replace('-','')"`;done
Bacula and encryption
On Debian systems OpenSSL has some legal issues that prevent it from being in the main pool of software; therefore, if you want SSL/TLS to be enabled, you need to rebuild with these few easy steps. See Legal issues.
apt-get build-dep bacula
aptitude install libssl-dev openssl
apt-get source bacula
cd bacula-2....
vim debian/rules
# (change the options to enable ssl/tls)
debian/rules binary
#it should now compile and pack all bacula packages
cd ..
dpkg -i bacula*.deb
References
http://www.bacula.org/presentations/Bacula-UKUUG-talk-20Feb08.pdf
- Debian Config Files in /etc/bacula/bacula-*.conf
http://www.bacula.org/en/dev-manual/Brief_Tutorial.html#TutorialChapter
http://www.bacula.org/en/dev-manual/Basic_Volume_Management.html
http://www.cyberciti.biz/tips/linux-disk-performance-monitoring-howto.html