【算法】区间神器——莫队

背景

莫队算法是由清华大学莫涛提出的,又称(MO’s Algorithm)。
可用于处理一切离线区间问题,立个flag

介绍

莫队基于分块,有关分块请点击,其实hzwer.com上面有更详细的讲解,其实我是先学了莫队,才搞懂分块的233,只要记住分块的思想就好了,大段标记,边缘暴力。

以下翻译$+$自己的理解于Anudeep’s blog(这位好像是个印度程序员)。顺便给出原文吧,总觉得自己翻译的烂。

1)提出问题;State a problem
2)介绍一个简单的$O(N^{2})$方算法;Explain a simple solution which takes$O(N^{2})$
3)小小的改变上述算法,不过仍然是一个$O(N^{2})$方算法;Slight modification to above algorithm.It still runs in $O(N^{2})$
4)介绍一个算法去解决上述问题,并且声明其正确性;Explain an algorithm to solve above problem and state its correctness
5)证明上述算法复杂度为$O(\sqrt N \times N)$;Proof for complexity of above algorithm – $O(\sqrt N \times N)$
6)我们什么时候使用该算法;Explain where and when we can use above algorithm
7)练习;Problems for practice and sample code

State a problem

给出$N$个数组成的序列,你需要回答$M$个问题,每个问题询问区间$[l,r]$,你需要回答区间内有多少个数至少出现过$3$次或以上。
例如:$Input:$

10
1 2 3 1 1 2 1 2 3 1
2
1 5
2 9

$Output:$

1//only 1 is repeated at least 3 times
2//1 and 2

Explain a simple solution which takesO(N^{2})

For each query,loop from $L$ to $R$,count the frequency of elements ans report the answer.Considering $M = N$, following runs in $O(N^{2})$ in worst case;
(很好理解就不翻了)

foreach query
    ans = 0
    cnt[] = 0
    for i = l to r
        cnt[A[i]]++
        if cnt[A[i]] == 3
            ans++

Slight modification to above algorithm. It still runs in O(N^{2})

add(pos)
    cnt[A[pos]]++
    if cnt[A[pos]] == 3
        ans++
sub(pos)
    cnt[A[pos]]--
    if cnt[A[pos]] == 2
        ans--
curL = curR = ans = cnt[] = 0
foreach query:
    //curL sould go to L,curR should go to R
    while(curL < L) sub(curL++)
    while(curL > L) add(curL--)
    while(curR < R) add(curR++)
    while(curR > R) sub(curR--)
    output ans

最初我们总是从L到R循环,但是现在我们改变之前查询的位置以适应当前查询。add()表示我们正在添加pos上的元素到当前集合中,并相应的更新答案。sub()表示删除。

Explain an algorithm to solve above problem and state its correctness

莫队就是一个这样的算法,只不过处理查询的顺序不同而已。我们会获得$M$组询问,我们将按特点的顺序重新排序查询,然后再处理。
显然这是一种离线算法,我们对其分块,求导可知(均值不等式也可推出),分成$\sqrt N$块最优,如果查询的左端点在第$P$块中,则查询第$P$块,我们对询问排过序,就有可能有许多同属于一个块的查询。
假设有如下询问:

{0,1}{1,7}{2,8}{7,8}{4,8}{4,4}{1,2}

按照块序号排序

{0,3}{1,7}{2,8}{1,2}{4,8}{4,4}{7,8}

按照$R$值升序排序

{1,2}{0,3}{1,7}{2,8}{4,4}{4,8}{7,8}

然后我们以上一节的程序来解决问题,即可在$O(\sqrt N \times N)$内解决

Proof for complexity of above algorithm – O(Sqrt(N) * N)

证明的话其实这里的英文写的很容易懂的,我懒得翻了
We are done with MO’s algorithm, it is just an ordering. Awesome part is its runtime analysis. It turns out that the O(N^2) code we wrote works in O(Sqrt(N) * N) time if we follow the order i specified above. Thats awesome right, with just reordering the queries we reduced the complexity from O(N^2) to O(Sqrt(N) * N), and that too with out any further modification to code. Hurray! we will get AC with O(Sqrt(N) * N).

Have a look at our code above, the complexity over all queries is determined by the 4 while loops. First 2 while loops can be stated as “Amount moved by left pointer in total”, second 2 while loops can be stated as “Amount moved by right pointer”. Sum of these two will be the over all complexity.

Most important. Let us talk about the right pointer first. For each block, the queries are sorted in increasing order, so clearly the right pointer (currentR) moves in increasing order. During the start of next block the pointer possibly at extreme end will move to least R in next block. That means for a given block, the amount moved by right pointer is O(N). We have O(Sqrt(N)) blocks, so the total is O(N * Sqrt(N)). Great!

Let us see how the left pointer moves. For each block, the left pointer of all the queries fall in the same block, as we move from query to query the left pointer might move but as previous L and current L fall in the same block, the moment is O(Sqrt(N)) (Size of the block). In each block the amount left pointer movies is O(Q * Sqrt(N)) where Q is number of queries falling in that block. Total complexity is O(M * Sqrt(N)) for all blocks.

There you go, total complexity is O( (N + M) * Sqrt(N)) = O( N * Sqrt(N))

Explain where and when we can use above algorithm

①强制在线不可莫队
②修改序列的话需要带修莫队
③能够编写sub()函数

例题

A.SPOJ DQUERY – D-query Luogu-spoj p3267 区间不同数个数
B.Codeforces 86 D CF86D 求区间内每种数字出现次数的平方$\times$数字的值 的和
C.CodeChef GERALD07 //smg 这OJ,,,
D.Codeforces 375 D CF375D 树上莫队
E.CodeChef IITI15