Discussion:
[rsnapshot-discuss] Enabling cmd_cp, link_dest and lazy_deletes on Linux?
dev
2016-03-10 14:38:50 UTC
Permalink
Hello,

My rsnapshot process has been running quite faithfully for years, and I am
looking for ways to trim some time off the process.

I noticed in the rsnapshot config file, 'cmd_cp' is commented out as
well as 'link_dest' and 'lazy_deletes'. Is there any harm in enabling
these features on a volume which did not have them enabled previously?

The current 'rm_rf' process is taking nigh on two hours right now, so
lazy_deletes sounds like a great way to keep the process moving along.

PS: This is on a Linux 14.04 box running SATA RAID-10 (Areca hardware).

Thanks
rsnap shot
2016-03-10 14:53:19 UTC
Permalink
------------------------------------------------------------------------------
Transform Data into Opportunity.
Accelerate data analysis in your applications with
Intel Data Analytics Acceleration Library.
Click to learn more.
http://pubads.g.doubleclick.net/gampad/clk?id=278785111&iu=/4140
David Cantrell
2016-03-10 15:53:25 UTC
Permalink
Post by dev
My rsnapshot process has been running quite faithfully for years, and I am
looking for ways to trim some time off the process.
I noticed in the rsnapshot config file, 'cmd_cp' is commented out as
well as 'link_dest' and 'lazy_deletes'. Is there any harm in enabling
these features on a volume which did not have them enabled previously?
Enabling cmd_cp should be safe, although I doubt it will actually make
much difference - whether we use cp written in C or our own code written
in perl, it'll spend most of its time waiting for disk I/O.

Enabling lazy_deletes will be safe. It won't actually save any time, of
course, but will at least release the lock early.

link_dest is something I've never used, but the doco seems to indicate
that it's only really useful if you're using something other than Linux,
and ...
Post by dev
PS: This is on a Linux 14.04 box running SATA RAID-10 (Areca hardware).
... but someone else may be able to help more with this.
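For reference, the three settings asked about might look like this once uncommented. This is a sketch rather than a recommendation: the stock rsnapshot.conf spells the lazy delete option use_lazy_deletes, the cp path is an assumption for a typical Linux box, and rsnapshot wants tabs rather than spaces between the fields.

```
# rsnapshot.conf excerpt (fields are separated by tabs, not spaces)
cmd_cp	/bin/cp
link_dest	1
use_lazy_deletes	1
```

Running "rsnapshot configtest" afterwards should catch a stray space or a bad path before the next scheduled run.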
--
David Cantrell | Nth greatest programmer in the world

There are many different types of sausages. The best are
from the north of England. The wurst are from Germany.
-- seen in alt.2eggs...
rsnap shot
2016-03-10 19:41:27 UTC
Permalink
Scott Hess
2016-03-10 20:08:37 UTC
Permalink
Post by David Cantrell
Enabling lazy_deletes will be safe. It won't actually save any time, of
course, but will at least release the lock early.
Post by rsnap shot
Thanks. Regarding lazy_deletes, right now the server spends about two
hours waiting for rm_rf to remove the oldest snapshot directory. I was
under the impression that lazy_deletes will instead 'mv' the directory out
of the way, and come back and delete it later. Is this incorrect?
Instead of deleting the directory then releasing the lock, it will release
the lock then delete the directory. The actual delete will take the same
amount of time, but the lock will be released earlier.
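The ordering described above can be sketched on a scratch directory. This is an illustration only, not rsnapshot's own code; the interval names and the lock file are stand-ins created under a temporary directory.

```shell
#!/bin/sh
# Sketch of the lazy-delete ordering on a throwaway directory tree.
# Not rsnapshot's code: the paths, lock file and interval names are illustrative.
set -eu

root=$(mktemp -d)
lock="$root/rsnapshot.pid"
mkdir "$root/hourly.0" "$root/hourly.1" "$root/hourly.2"

echo $$ > "$lock"                        # take the lock
mv "$root/hourly.2" "$root/_delete.$$"   # a rename is quick on a single filesystem
mv "$root/hourly.1" "$root/hourly.2"
mv "$root/hourly.0" "$root/hourly.1"
mkdir "$root/hourly.0"                   # stands in for the real cp -al / rsync step
rm -f "$lock"                            # release the lock first...

lock_released_before_delete=no
[ -d "$root/_delete.$$" ] && lock_released_before_delete=yes

rm -rf "$root/_delete.$$"                # ...then pay for the slow delete

echo "lock released before delete: $lock_released_before_delete"
ls "$root"
```

The delete costs the same either way; only its position relative to the lock release and the snapshot step changes.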

On the one hand, releasing the lock earlier may allow a later rsnapshot to
successfully get the lock, which sounds good. On the other hand, having a
later rsnapshot run in parallel with the earlier rsnapshot's delete means
they will be competing for limited I/O bandwidth, so the later rsnapshot
will run more slowly.

-scott
rsnap shot
2016-03-10 21:35:46 UTC
Permalink
Scott Hess
2016-03-10 22:13:23 UTC
Permalink
Post by Scott Hess
On the one hand, releasing the lock earlier may allow a later rsnapshot to
successfully get the lock, which sounds good. On the other hand, having a
later rsnapshot run in parallel with the earlier rsnapshot's delete means
they will be competing for limited I/O bandwidth, so the later rsnapshot
will run more slowly.
Post by rsnap shot
Drat. Short of commenting out rm_rf(
"$config_vars{'snapshot_root'}/_delete.$$" ); in the source, is there a way
to tell rsnapshot to leave the _delete.$$ dirs around for cron to clean up
later? I realize this isn't ideal as it doesn't free up disk space.
Which scenarios would that help? The delete isn't going to be faster later
on. Maybe you should describe the specific scenario which is causing you
to wish to turn this feature on?

Hmm. I forgot that lazy_deletes also moves the delete from before the
snapshot to after, so your snapshot is made closer to when you ran the
command. Here's a snippet from my rsnapshot.log:

[10/Mar/2016:12:00:37] /usr/bin/rsnapshot hourly: started
[10/Mar/2016:12:00:37] echo 3105 > /var/run/rsnapshot.pid
[10/Mar/2016:12:00:37] mv /.snapshots/hourly.3/ /.snapshots/_delete.3105/
[10/Mar/2016:12:00:37] mv /.snapshots/hourly.2/ /.snapshots/hourly.3/
[10/Mar/2016:12:00:37] mv /.snapshots/hourly.1/ /.snapshots/hourly.2/
[10/Mar/2016:12:00:37] mv /.snapshots/hourly.0/ /.snapshots/hourly.1/
[10/Mar/2016:12:00:37] /bin/cp -al /.snapshots/.sync /.snapshots/hourly.0
[10/Mar/2016:12:06:07] rm -f /var/run/rsnapshot.pid
[10/Mar/2016:12:06:07] /bin/rm -rf /.snapshots/_delete.3105
[10/Mar/2016:12:11:55] /usr/bin/rsnapshot hourly: completed successfully

Without lazy_deletes, the five minutes spent in the last rm would come
before the cp.

The place I found lazy_deletes to be most helpful is in the intervals past
the first one. For those, new snapshots are never created, they are only
stolen from earlier periods, like this:

[10/Mar/2016:03:45:01] /usr/bin/rsnapshot daily: started
[10/Mar/2016:03:45:01] echo 2530 > /var/run/rsnapshot.pid
[10/Mar/2016:03:45:01] mv /.snapshots/daily.6/ /.snapshots/_delete.2530/
[10/Mar/2016:03:45:01] mv /.snapshots/daily.5/ /.snapshots/daily.6/
[10/Mar/2016:03:45:01] mv /.snapshots/daily.4/ /.snapshots/daily.5/
[10/Mar/2016:03:45:01] mv /.snapshots/daily.3/ /.snapshots/daily.4/
[10/Mar/2016:03:45:01] mv /.snapshots/daily.2/ /.snapshots/daily.3/
[10/Mar/2016:03:45:01] mv /.snapshots/daily.1/ /.snapshots/daily.2/
[10/Mar/2016:03:45:01] mv /.snapshots/daily.0/ /.snapshots/daily.1/
[10/Mar/2016:03:45:01] mv /.snapshots/hourly.3/ /.snapshots/daily.0/
[10/Mar/2016:03:45:01] rm -f /var/run/rsnapshot.pid
[10/Mar/2016:03:45:01] /bin/rm -rf /.snapshots/_delete.2530
[10/Mar/2016:03:49:07] /usr/bin/rsnapshot daily: completed successfully

Once a week, my daily runs after weekly, in which case the rm doesn't
happen (since weekly stole the oldest daily). Without lazy_deletes, I had
to be careful to space out my cron jobs so that there was time to run the
delete pass before the next job was started. With lazy_deletes, both cases
are about the same, so I can bunch things up more. I still leave a
generous time buffer between the last non-hourly job and the first hourly
job, but I don't have to worry as much about timing.
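To see how much of a run the delete accounts for, the timestamps in rsnapshot.log can be subtracted. A rough sketch, assuming the log format in the snippets above and runs that do not cross midnight; rm_durations is a made-up helper name, not part of rsnapshot.

```shell
# Report how long each lazy delete took, using rsnapshot.log timestamps.
# Assumes the log format shown above and that a run does not cross midnight.
rm_durations() {
    awk '
        # Convert a leading [dd/Mon/yyyy:HH:MM:SS] stamp to seconds since midnight.
        function secs(stamp,    t) {
            split(stamp, t, ":")
            sub(/\]/, "", t[4])
            return t[2] * 3600 + t[3] * 60 + t[4]
        }
        # The delete starts at the "rm -rf .../_delete.<pid>" line.
        / -rf .*_delete\./ { start = secs($1); target = $NF }
        # It has finished by the time the run is logged as completed.
        /completed successfully/ && start != "" {
            printf "%s %d\n", target, secs($1) - start
            start = ""
        }
    ' "$@"
}
```

Fed the two excerpts above, it reports 348 seconds for /.snapshots/_delete.3105 and 246 seconds for /.snapshots/_delete.2530.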

-scott
rsnap shot
2016-03-10 22:28:36 UTC
Permalink
Scott Hess
2016-03-10 23:11:45 UTC
Permalink
Post by Scott Hess
Which scenarios would that help? The delete isn't going to be faster
later on. Maybe you should describe the specific scenario which is causing
you to wish to turn this feature on?
Post by rsnap shot
Apologies. I should have been clearer in my first post. I start rsnapshot
from cron around 2pm (for monthlies). I need to leave about 3 hours
between jobs. After the daily sync, I start an Amanda[1] process from
cmd_postexec which runs anywhere from 10am to 1pm the next day.
00 14 1 * * /usr/bin/rsnapshot monthly
00 17 * * 0 /usr/bin/rsnapshot weekly
00 20 * * * /usr/bin/rsnapshot daily
I noticed the rm process is taking 2 hours or more to complete, not just
for each daily run but for weeklies and monthlies as well. What I'd like to
do is wait to do any rm-ing until all rotations and Amanda backups have
finished. This allows me to insert a new tape into the drive during working
hours and let the rm process happen much later. Yes, it will require lots of
disk space.
I suspect that if you used lazy_deletes, then you could put these closer
together. I'd guess that daily is your first interval? If so, perhaps you
could enable sync_first and lazy_deletes, then run things something like:

00 14 1 * * /usr/bin/rsnapshot monthly
10 14 * * 0 /usr/bin/rsnapshot weekly
20 14 * * * /usr/bin/rsnapshot sync && /usr/bin/rsnapshot daily

Here's my theory of operation ... the monthly may leave a lazy delete
running, but that won't impact the weekly much. After weekly there may be
two lazy deletes running, which will compete with the sync for I/O, but the
sync I/O will have some gaps waiting for rsync, and won't be as intensive
as the cp -al. Hopefully by the time the actual daily runs, the deletes
will have completed. Obviously this assumes that your sync takes an
appreciable amount of time compared to the rm -rf. Having rm -rf and cp
-al running at the same time is probably not optimal.

Or maybe not :-). You could replace cmd_rm with something to move the
deleted file into a .delete subdir, then have a job circle around later to
clean out that directory. This link:

http://serverfault.com/questions/183821/rm-on-a-directory-with-millions-of-files/328305#328305
has speculation on slow deletes and some workarounds (a trick using rsync
looks interesting). You could probably find other similar discussions on
Stack Overflow.
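The "move it aside, clean up later" idea could look roughly like the following. It is a sketch under assumptions: deferred_rm and purge_parked are hypothetical names, the wrapper is assumed to be called the way rm is (options first, then paths), and anything else that relies on cmd_rm would be diverted too. Disk space is not freed until the purge runs.

```shell
# Hypothetical stand-in for cmd_rm: park targets in <parent>/.delete instead of removing them.
# Assumes it is invoked like rm, e.g. with "-rf /.snapshots/_delete.3105".
deferred_rm() {
    for target in "$@"; do
        case "$target" in
            -*) continue ;;                  # ignore rm-style options such as -rf
        esac
        [ -e "$target" ] || continue         # nothing to park
        target=${target%/}                   # drop a trailing slash, if any
        parking="$(dirname "$target")/.delete"
        mkdir -p "$parking"
        # A rename within one filesystem is quick; PID and time keep the name unique.
        mv "$target" "$parking/$(basename "$target").$$.$(date +%s)"
    done
}

# Later, from cron: empty the parking directory when the disks are otherwise idle.
purge_parked() {
    parking="$1/.delete"
    [ -d "$parking" ] || return 0
    find "$parking" -mindepth 1 -maxdepth 1 -exec rm -rf {} +
}
```

To use it as cmd_rm, the first function would need to live in an executable script that ends with deferred_rm "$@"; the purge would then be scheduled separately, for example after the tape backup has finished.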

-scott
rsnap shot
2016-03-11 14:39:49 UTC
Permalink
David Cantrell
2016-03-15 13:47:05 UTC
Permalink
On Fri, Mar 11, 2016 at 08:39:49AM -0600, rsnap shot wrote: <div> </div><div> </div><div>10.03.2016, 17:11, "Scott Hess" &lt;***@doubleu.com&gt;:</div><blockquote type="cite"><div><div><div><br /><div> </div><div>...

Please don't post HTML-only emails to the list.
--
David Cantrell | top google result for "internet beard fetish club"

I think the most difficult moment that anyone could face is seeing
their domestic servants, whether maid or drivers, run away
-- Abdul Rahman Al-Sheikh, writing on 25 Jan 2004 at
http://www.arabnews.com/node/243486
Patrick O'Callaghan
2016-03-15 16:18:01 UTC
Permalink
Post by David Cantrell
Please don't post HTML-only emails to the list.
+1

Also, try not to top-post (I know most people don't, but still ...)

poc

David Cantrell
2016-03-10 22:27:41 UTC
Permalink
Post by rsnap shot
Drat. Short of commenting out rm_rf(
"$config_vars{'snapshot_root'}/_delete.$$" ); in the source, is there a
way to tell rsnapshot to leave the _delete.$$ dirs around for cron to
clean up later? I realize this isn't ideal as it doesn't free up disk space.
You could set cmd_rm to something that does nothing, like /bin/echo or
something like that.
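If cmd_rm is pointed at a no-op as suggested, the _delete.<pid> directories stay behind under the snapshot root and something has to remove them later. A minimal sketch of such a cleanup, assuming GNU or BusyBox find; purge_lazy_deletes is a made-up name and the snapshot root is passed in as an argument.

```shell
# Remove leftover lazy-delete directories directly under a snapshot root.
# Assumes cmd_rm is a no-op, so rsnapshot leaves _delete.<pid> directories behind.
purge_lazy_deletes() {
    snapshot_root="$1"
    # -maxdepth 1 keeps find from walking into the snapshots themselves.
    find "$snapshot_root" -maxdepth 1 -type d -name '_delete.*' -exec rm -rf {} +
}
```

Called from cron as purge_lazy_deletes /.snapshots at a quiet hour, this does the same work rsnapshot would have done, just at a time of your choosing. The disk space stays in use until it runs.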
--
David Cantrell | Cake Smuggler Extraordinaire

We found no search results for "crotchet". Did you mean "crotch"?
rsnap shot
2016-03-10 23:02:56 UTC
Permalink
David Cantrell
2016-03-10 22:24:18 UTC
Permalink
Post by David Cantrell
Enabling lazy_deletes will be safe. It won't actually save any time, of
course, but will at least release the lock early.
Post by rsnap shot
Thanks. Regarding lazy_deletes, right now the server spends about two
hours waiting for rm_rf to remove the oldest snapshot directory. I was
under the impression that lazy_deletes will instead 'mv' the directory
out of the way, and come back and delete it later. Is this incorrect?
That is correct.
--
David Cantrell | Reality Engineer, Ministry of Information

You can't spell "slaughter" without "laughter"
c***@ccs.covici.com
2016-03-11 00:41:37 UTC
Permalink
link_dest is quite useful, if the rsync is interrupted, you can do it
again without having to rollback the backups.
_______________________________________________
rsnapshot-discuss mailing list
https://lists.sourceforge.net/lists/listinfo/rsnapshot-discuss
--
Your life is like a penny. You're going to lose it. The question is:
How do you spend it?

John Covici
***@ccs.covici.com