[linux-lvm] Segmentation fault in LVM2 latest version.
xiaowei
2014-03-11 08:34:38 UTC
Hi all,

I can reproduce a segmentation fault in lvm2
(lvm2-2.02.100-8.el6.x86_64) with the steps below:

1. Create the PVs and a VG from 9 disks, 4GB each:
   pvcreate /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj
   vgcreate vgmlpap /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj
2. Create an LV, say 2G initially, with 6 stripes:
   lvcreate -L 2G -n lvpap007 -i 6 -I 128k vgmlpap
3. Extend this LV with 1 stripe (repeat this command several times):
   lvextend -L +600M -i 1 /dev/mapper/vgmlpap-lvpap007
4. The segmentation fault appears when the first physical volume cannot
   supply the required extents.


I tracked this a little further and found that it's due to the 1-stripe
parameter: when the first PV doesn't have enough extents left, it needs 2
areas to do the space allocation, but the number of areas was not extended.
So when it tries to access the aa array in the function _alloc_parallel_area
in lib/metadata/lv_manip.c with an index that does not exist, it segfaults.
These are the lines:

    aa[s].len = (ah->alloc_and_split_meta) ? len - ah->log_len : len;
    /* Skip empty allocations */
    if (!aa[s].len)
        continue;

    aa[s].pv = pva->map->pv;
    aa[s].pe = pva->start;
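
To illustrate the kind of failure described here, a minimal standalone sketch follows. It is not LVM2 code: struct area, fill_areas() and its parameters are hypothetical names made up for the example, and the real allocator is considerably more involved. It only shows how checking the requested number of areas against the number of slots actually allocated turns an out-of-bounds write into a reported error.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one allocated area; not the LVM2 struct. */
struct area {
    uint32_t pe;   /* first physical extent of the area */
    uint32_t len;  /* length of the area in extents */
};

/*
 * Copy 'wanted' areas from src into aa, which has room for 'capacity'
 * entries. Returns 1 on success, 0 if the request would overrun aa.
 */
int fill_areas(struct area *aa, size_t capacity,
               const struct area *src, size_t wanted)
{
    size_t s;

    if (!aa || !src || wanted > capacity)
        return 0;  /* refuse instead of indexing past the end of aa */

    for (s = 0; s < wanted; s++) {
        /* Skip empty allocations: aa[s] is left untouched */
        if (!src[s].len)
            continue;

        aa[s].pe = src[s].pe;
        aa[s].len = src[s].len;
    }

    return 1;
}
```

In this sketch, asking for 2 areas when aa was sized for 1 makes fill_areas() return 0 rather than write to aa[1].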

I am not an expert in the LVM code. Could you please provide a quick fix
for this? It should not be hard :)


Please reply to my email address directly, as I am not subscribed to the mailing list.

Thanks,
Xiaowei
Alasdair G Kergon
2014-03-12 15:58:31 UTC
Reproduced with upstream code.

xiaowei wrote:
> I tracked this a little further and found that it's due to the 1-stripe
> parameter: when the first PV doesn't have enough extents left, it needs 2
> areas

It only needs 1 area - that part is correct - but it's still behaving as if
it's setting up an N-stripe extension where it pairs up the allocations with
the original stripes.

Alasdair
xiaowei
2014-03-14 01:05:09 UTC
Hi Alasdair,

It's great that you reproduced this bug :)
So could the fix be easy?

Thanks,
Xiaowei
Alasdair G Kergon
2014-03-18 04:04:02 UTC
xiaowei wrote:
> It's great that you reproduced this bug :)
> So could the fix be easy?

No. Once you know what's wrong, you can find several more classes
of lvextend that misbehave.

I do have a prototype fix that deals with the case you hit, but it's not
complete yet.

There are two ways you can extend an LV:
1. By pairing up newly-allocated areas of disk with existing ones
as you go along.
(A simple example would be extending each existing stripe on the same
disk.)
2. By finding possible areas of disk to fit the required
number of areas then selecting the most appropriate ones.
(The general case.)
You can also have a combination of the two, with some areas from 1
and some from 2.

The first is 'PREFERRED' in the code, and the second is 'USE_AREA'.
The path through the code that you hit tried to use the first
method when it should have used the second.
(If the number of stripes is changing there's no trivial way to
'pair up' the new areas with the existing ones.)
Minor structural changes are needed (code clean up) so that the
cling policy can sometimes use the second method.
The default 'allocation/maximise_cling' option fails.
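
As a rough sketch of the distinction above, and explicitly not the actual LVM2 logic, a standalone C helper could choose between the two methods as shown below. The enum and function names are hypothetical and exist only for this example. It also leaves out the combined case and the cling policy, and encodes only one rule: pairing is attempted only when the stripe count does not change.

```c
/* Hypothetical labels for the two approaches described above. */
enum extend_method {
    EXTEND_PAIR_WITH_EXISTING,  /* method 1: pair new areas with existing ones */
    EXTEND_FIND_ANY_AREAS       /* method 2: the general case */
};

/*
 * Pick an approach for an extension. Pairing is only attempted when the
 * stripe count is unchanged; a zero count also falls back to the general
 * method.
 */
enum extend_method choose_extend_method(unsigned existing_stripes,
                                        unsigned requested_stripes)
{
    if (existing_stripes && existing_stripes == requested_stripes)
        return EXTEND_PAIR_WITH_EXISTING;

    return EXTEND_FIND_ANY_AREAS;
}
```

With the numbers from this thread (a 6-stripe LV extended with -i 1), this helper returns EXTEND_FIND_ANY_AREAS.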

Alasdair
xiaowei
2014-04-11 01:14:40 UTC
Hi Alasdair,

May I ask if you have a patch to test for this now?

Thanks,
Xiaowei
Alasdair G Kergon
2014-04-15 14:43:35 UTC
xiaowei wrote:
> May I ask if you have a patch to test for this now?

Try the latest upstream code and let me know of any further
problems you find.

Alasdair
xiaowei
2014-04-17 07:47:50 UTC
Tested with the latest upstream code; it fixes the segfault.
Thanks, Alasdair.

Xiaowei