Discussion:
[rsnapshot-discuss] Only include Users Mail-Dir?
s***@posteo.de
2016-02-24 07:30:23 UTC
Hello,
I am responsible for backing up users' home directories on a server
that hosts about 100 users.
Most of the users have just their maildirs and maybe a few megabytes in
their $HOME.
But some users have several hundred gigabytes, and are excluded entirely
from backups.
Until now I have been running ncdu (https://dev.yorhel.nl/ncdu) and adding
those who use more than 10 GB by hand to an /etc/rsnapshot.excludes file.
How can I include _everybody's_ maildir, while still being able to
exclude all the junk and superfluous files?
I would rather not have to exclude every mp3, avi, mp4, wav, vob, etc.
extension individually. There are just too many of them.

My question: is there maybe an include statement that includes _all_
mailbox directories in all users' $HOMEs, while still otherwise honoring
the /etc/rsnapshot.excludes file?

Any thoughts about this?

Dirk
Nico Kadel-Garcia
2016-02-24 13:07:30 UTC
Save yourself the work. Use a pre-rsnapshot command that talks to the
relevant system and publishes a list of "abusive" mail directories
into a flat text file. Rsync that file over, *first*, to where your
rsnapshot configuration files live, and use the "--exclude-from=FILENAME"
option to read exclusions from that file before any
includes get listed.

You can reverse the logic and use includes, first, but it can get
pretty silly. And definitely beware of users who oscillate between
"small enough" and "too darn big", because they'll keep getting backed
up without the hard links.

You'll need to adjust the exclusions to handle the relative addressing
and not exclude other directories of matching names in other parts of
your target filesystems, but this is usually straightforward.
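
One way to combine the two ideas above (an exclusion list, plus an include that comes first) is to generate rsync filter lines per oversized user. The following is a minimal Python sketch, not a tested rsnapshot recipe: the function name build_filter_lines, the /home/<user> layout, the "Maildir" directory name and the sizes are all assumptions made for illustration. It relies on rsync applying the first matching rule, and on files read via --exclude-from accepting lines that start with "+ " or "- ".

```python
def build_filter_lines(home_sizes, limit_bytes, keep_dir="Maildir"):
    """Return rsync filter lines for every home directory above limit_bytes.

    home_sizes maps an absolute home path (e.g. "/home/alice") to its size
    in bytes. For each oversized home, keep_dir and everything below it is
    included first, then the rest of that home's contents is excluded.
    rsync uses the first matching rule, so the include has to come first.
    """
    lines = []
    for home in sorted(home_sizes):
        if home_sizes[home] <= limit_bytes:
            continue  # small enough: back up the whole home as usual
        lines.append(f"+ {home}/{keep_dir}/***")
        # Exclude the contents ("/*"), not the directory itself, so rsync
        # still descends into the home and can reach keep_dir.
        lines.append(f"- {home}/*")
    return lines


if __name__ == "__main__":
    GIB = 1024 ** 3
    # Made-up sizes, purely for illustration.
    sizes = {"/home/alice": 3 * GIB, "/home/bob": 250 * GIB}
    print("\n".join(build_filter_lines(sizes, 10 * GIB)))
```

The generated lines would need to sit at the top of the file handed to --exclude-from (rsnapshot's exclude_file setting), ahead of the general patterns. Whether the leading /home/... actually matches depends on the relative addressing mentioned above, so it is worth checking with rsync's --dry-run before trusting it.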
Thomas Fjellstrom
2016-02-27 15:07:41 UTC
That is an interesting method. I never thought of that, and might try it
myself.

I currently have a really long excludes file :D The core of it is this:

*.o
*.ko
*~
[#]*
*.so
*.so.*
*.a
*.bak
*.log
*.tar.*
*.tbz2
*.tgz
*.gz
*.iso
*.img
*.bin
*.vdi
*.ktr
*.deb
*.zip
*.7z
*.mkv
*.ogg
*.img.[1234567890]*
*.mov
*.mpg
*.tar
*.TAR
*.avi
*.flv
*.ogv
*.flac
*.mp4

# here's to hoping we have decent quality jpegs saved for gimp images
*.xcf

core
core.[1234567890]*
debug.txt
warn*.txt
.xsession-errors
Maildir/.INBOX.Spam/*/***

But there are many more items that are specific to my setup.

A more dynamic setup that detects large files and/or folders is a very
interesting way to go.

The question is, what is the best way to implement that? SSH over and run a
script (or some find command?) to send back a list of things to exclude?
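
As a rough illustration of the "run a script" option, here is a Python sketch that walks each home directory and reports the ones above a size limit. The function names tree_size and oversized_homes are hypothetical, and the 10 GiB limit in the usage note is only an example. It sums apparent file sizes (not disk blocks, so the numbers can differ from what ncdu shows), does not follow symlinks, and counts hard-linked files once per link.

```python
import os


def tree_size(path):
    """Sum apparent file sizes below path without following symlinks."""
    total = 0
    # onerror swallows unreadable directories instead of aborting the walk.
    for dirpath, _dirnames, filenames in os.walk(path, onerror=lambda err: None):
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is unreadable; skip it
            total += st.st_size
    return total


def oversized_homes(home_root, limit_bytes):
    """Return {home_path: size_in_bytes} for direct subdirectories above the limit."""
    result = {}
    with os.scandir(home_root) as entries:
        for entry in entries:
            if not entry.is_dir(follow_symlinks=False):
                continue
            size = tree_size(entry.path)
            if size > limit_bytes:
                result[entry.path] = size
    return result
```

Calling oversized_homes("/home", 10 * 1024 ** 3) would return the homes above 10 GiB. Reading every user's files needs broad privileges on the server, which is a reason to think about where such a script runs and who is allowed to start it.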
_______________________________________________
rsnapshot-discuss mailing list
https://lists.sourceforge.net/lists/listinfo/rsnapshot-discuss
--
Thomas Fjellstrom
***@fjellstrom.ca
Nico Kadel-Garcia
2016-02-27 15:47:43 UTC
On Sat, Feb 27, 2016 at 10:07 AM, Thomas Fjellstrom
Post by Thomas Fjellstrom
That is an interesting method. I never thought of that, and might try it
myself.
*.o
*.ko
*~
[#]*
*.so
*.so.*
*.a
*.bak
*.log
*.tar.*
*.tbz2
*.tgz
*.gz
*.iso
*.img
*.bin
*.vdi
*.ktr
*.deb
*.zip
*.7z
*.mkv
*.ogg
*.img.[1234567890]*
*.mov
*.mpg
*.tar
*.TAR
*.avi
*.flv
*.ogv
*.flac
*.mp4
Ouch. If you're backing up things like "/etc" or source-controlled
subdirectories, you can get in real trouble when you start making
excessively long "ignore" files. You can use multiple exclude files to
break this up into manageable chunks for different environments.
Post by Thomas Fjellstrom
A more dynamic setup that detects large files and/or folders is a very
interesting way to go.
Yup! It's not perfect, but it's potentially more adaptable.
Post by Thomas Fjellstrom
Question is, what is the best way to implement that? Ssh over and run a script
(or some find command?) to send back a list of things to exclude?
Depends on what you want. A pre-exec command can be really useful, but
it means that your remote SSH credentials need the ability to run
remote scripts with lots of privileges to parse user home directories.
That... makes me really nervous. I tend to set up rsnapshot
connections with the old "validate-rsync.sh" script as a forced command
tied to the SSH public key, with the script modified to permit only
read operations. ( http://troy.jdmz.net/rsync/index.html )

I'd activate a nightly "report oversized Maildir" script on the actual
mail server, and have *that* publish an updated "exclude" list. Then
I'd have rsnapshot sync that file from the remote server, with the
same credentials used to run rsnapshot over rsync.
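
If a nightly job on the mail server publishes the list, one detail worth considering is that the backup host might fetch the file while it is being rewritten. The Python sketch below shows one possible way to avoid that: write to a temporary file in the same directory, then rename it over the published name. The function name publish_atomically and the ".excludes-" prefix are made up for this example, and the approach assumes the temporary file and the destination are on the same filesystem.

```python
import os
import tempfile


def publish_atomically(lines, dest_path):
    """Write lines to dest_path so readers see the old or the new file, never a partial one."""
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, prefix=".excludes-")
    try:
        with os.fdopen(fd, "w") as tmp:
            for line in lines:
                tmp.write(line + "\n")
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the data is on disk before renaming
        os.chmod(tmp_path, 0o644)  # mkstemp creates the file with mode 0600
        os.replace(tmp_path, dest_path)  # atomic rename within one filesystem
    except BaseException:
        # Remove the temporary file if anything above failed.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

The backup side can then pull the published file with the same read-only rsync credentials before rsnapshot runs, and never needs permission to execute anything on the mail server.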
Christopher Barry
2016-02-27 17:07:14 UTC
On Sat, 27 Feb 2016 10:47:43 -0500
Post by Nico Kadel-Garcia
Depends on what you want. A pre-exec command can be really useful, but
it means that your remote SSH credentials need the ability to run
remote scripts with lots of privileges to parse user home directories.
That... makes me really nervous. I tend to set up rsnapshot
connections with the old "validate-rsync.sh" script as a forced command
tied to the SSH public key, with the script modified to permit only
read operations. ( http://troy.jdmz.net/rsync/index.html )
I was totally unaware that this could be done. Thanks!
Great linked site too - thanks.
Post by Nico Kadel-Garcia
I'd activate a nightly "report oversized Maildir" script on the actual
mail server, and have *that* publish an updated "exclude" list. Then
I'd have rsnapshot sync that file from the remote server, with the
same credentials used to run rsnapshot over rsync.
Exactly, have this run from cron on the remote server every hour or so.
Maybe even inotify could be helpful here.
http://www.ibm.com/developerworks/linux/library/l-ubuntu-inotify/index.html

-C
