20 Aug 2013 15:05

## Re: Order within a GRanges object

```
>Hello,
>
>I have some assumptions about the internal order of GRanges objects.
>
>1) Automatically there is an order depending on the a) seqnames (=
>chromosomes) and b) the ranges.

No! There is no guarantee on the order.

> library(GenomicRanges)
> example(GRanges)
...
> longGR
GRanges with 30 ranges and 1 metadata column:
seqnames     ranges strand   |     score
<Rle>  <IRanges>  <Rle>   | <integer>
a     chr1    [1, 10]      -   |         1
b     chr2    [2, 10]      +   |         2
c     chr2    [3, 10]      +   |         3
d     chr2    [4, 10]      *   |         4
e     chr1    [5, 10]      *   |         5
...      ...        ...    ... ...       ...
chr2 [106, 115]      -   |        26
chr2 [107, 116]      -   |        27
chr3 [108, 117]      -   |        28
chr3 [109, 118]      -   |        29
chr3 [110, 119]      -   |        30
---
seqlengths:
chr1 chr2 chr3
1000 2000 1500
>  rev(longGR)
GRanges with 30 ranges and 1 metadata column:
seqnames     ranges strand   |     score
<Rle>  <IRanges>  <Rle>   | <integer>
chr3 [110, 119]      -   |        30
chr3 [109, 118]      -   |        29
chr3 [108, 117]      -   |        28
chr2 [107, 116]      -   |        27
chr2 [106, 115]      -   |        26
...      ...        ...    ... ...       ...
e     chr1    [5, 10]      *   |         5
d     chr2    [4, 10]      *   |         4
c     chr2    [3, 10]      +   |         3
b     chr2    [2, 10]      +   |         2
a     chr1    [1, 10]      -   |         1
---
seqlengths:
chr1 chr2 chr3
1000 2000 1500
>

>
>2) The seqnames are always sorted in ascii order.

No! But they _can_ be:

> sort(longGR)
GRanges with 30 ranges and 1 metadata column:
seqnames     ranges strand   |     score
<Rle>  <IRanges>  <Rle>   | <integer>
f     chr1    [6, 10]      +   |         6
chr1    [1,  5]      -   |       101
a     chr1    [1, 10]      -   |         1
chr1    [2,  6]      -   |       102
chr1    [3,  7]      -   |       103
...      ...        ...    ... ...       ...
j     chr3 [ 10,  10]      -   |        10
chr3 [ 10,  14]      -   |       110
chr3 [108, 117]      -   |        28
chr3 [109, 118]      -   |        29
chr3 [110, 119]      -   |        30
---
seqlengths:
chr1 chr2 chr3
1000 2000 1500

~ Malcolm Cook

>
>3) After
>    df <- as.data.frame
>    m <- regexpr ("\\d+", df$seqnames, perl=TRUE)
>    df$Chromosome <- regmatches (df$seqnames, m)
>    df$Chromosome <- as.integer (as.character (df$Chromosome))
>    df <- df [order(df$Chromosome),]
>    only the order of the chromosomes is changed. The order of the ranges
>(now df$start and df$end) is still the same.
>
>Are my assumptions true?
>
>Thanks Hermann
>
>
```
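
On points 1 and 2, a minimal sketch may help. The toy ranges below are made up for illustration and are not from the thread. It assumes GenomicRanges is installed and shows that a GRanges keeps whatever order it was built in, and that `sort()` follows the seqlevels order (then strand, then position) rather than ASCII order of the names.

```r
library(GenomicRanges)

# Toy ranges, deliberately unsorted (hypothetical data, not from the thread).
# The factor levels fix the seqlevels order: chr1, chr2, chr10.
gr <- GRanges(
  seqnames = factor(c("chr10", "chr2", "chr1", "chr2"),
                    levels = c("chr1", "chr2", "chr10")),
  ranges   = IRanges(start = c(5, 20, 7, 3), width = 4)
)

# Elements stay in construction order; nothing is sorted automatically.
unsorted_names <- as.character(seqnames(gr))

# sort() orders by seqlevels, then strand, then start and end.
sorted_gr    <- sort(gr)
sorted_names <- as.character(seqnames(sorted_gr))
sorted_start <- start(sorted_gr)

# Plain C-locale character sorting would put "chr10" before "chr2".
ascii_names <- sort(unsorted_names, method = "radix")
```

Here `sorted_names` comes out as chr1, chr2, chr2, chr10, while `ascii_names` is chr1, chr10, chr2, chr2. If a particular chromosome order is wanted, the usual lever is the seqlevels order, not the names themselves.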

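On point 3, which the reply above does not address, here is a small base-R sketch. The data frame is a hypothetical stand-in for the `as.data.frame()` output. `order()` leaves tied values in their original relative order, so ordering by the chromosome number alone keeps the rows within each chromosome in the sequence they already had. That sequence is only sorted by position if it was sorted to begin with.

```r
# Hypothetical stand-in for as.data.frame() applied to a GRanges
df <- data.frame(
  seqnames = c("chr2", "chr10", "chr1", "chr2", "chr1"),
  start    = c(30L, 5L, 9L, 10L, 2L),
  end      = c(40L, 8L, 12L, 20L, 6L),
  stringsAsFactors = FALSE
)

# Pull the numeric part out of each chromosome name
# (assumes every name contains digits; names such as "chrX" would not match)
m <- regexpr("[0-9]+", df$seqnames)
df$Chromosome <- as.integer(regmatches(df$seqnames, m))

# Ordering by chromosome only: ties keep their original relative order,
# so starts within chr1 stay 9, 2 and within chr2 stay 30, 10
by_chrom <- df[order(df$Chromosome), ]

# Adding start as a second key also sorts the ranges within each chromosome
by_chrom_start <- df[order(df$Chromosome, df$start), ]
```

With this input, `by_chrom$start` is 9, 2, 30, 10, 5 and `by_chrom_start$start` is 2, 9, 10, 30, 5.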
