BZOJ5087 Polycomp < bitset+分块 >

Problem

Polycomp

Time  Limit:  40  Sec\mathrm{Time\;Limit:\;40\;Sec}
Memory  Limit:  256  MB\mathrm{Memory\;Limit:\;256\;MB}

Description

你有三个系数为0,10,1的多项式f(x),g(x),h(x)f(x),g(x),h(x)
f(g(x))modh(x)f(g(x))\mod{h(x)}
为方便起见,将答案多项式所有系数对22取模输出即可
如果f(x)=k=1nakxkf(x)=\sum_{k=1}^{n} a_k\cdot x^k,则f(g(x))=k=1nakg(x)kf(g(x))=\sum_{k=1}^{n}a_k\cdot g(x)^k

Input

一共三行,每行一个多项式,分别为f,g,hf,g,h
对于一个多项式PP,描述为nn个整数P0,P1,,PnP_0,P_1,\cdots,P_n,其中PiP_i0011P(x)=P0+P1x++PnxnP(x)=P_0+P_1\cdot x+\cdots+P_n\cdot x_n

Output

用同样的格式输出答案多项式
如果答案为00,输出0 0

Sample Input

1
2
3
5 0 1 0 1 0 1
2 1 1 1
4 0 1 1 0 1

Sample Output

1
1 1 1

HINT

nn表示多项式最高项的次数,n4000n\le4000

Source

By clj

标签:bitset 分块

Solution

陈老师神题。

直接用数组模拟乘除过程,可以做到O(n3)O(n^3),若用bitset维护,即可做到O(n332)O(\frac{n^3}{32})。然而这样还是会TLE\mathrm{TLE}

考虑类似分块的想法,将ff的每十位分成一段,预处理2102^{10}种情况对应的和,然后每次十位十位地加,即可做到O(n3320)O(\frac{n^3}{320}),时间复杂度可以接受。
根据vanilla的调参,貌似这种先预处理的方法在块大小为1616时最快。

OwenOwl有更快的方法,即先不预处理出所有和,最后加的时候再根据每块情况计算,这样块大小可以更大,情况数更少,会更快。

Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
#include <bits/stdc++.h>
using namespace std;
typedef bitset<8001> poly;
template <class T> inline int read(T &x) {
x = 0; int c = getchar(), f = 1;
for (; !isdigit(c); c = getchar()) if (c == 45) f = -1;
for (; isdigit(c); c = getchar()) (x *= 10) += f*(c-'0');
return x;
}
int lf, lg, lh, len, LOG[1<<16|1];
poly f, g, h, t, ans, pw[17], s[1<<16|1];
void init() {for (int i = 2; i <= (1<<16); i++) LOG[i] = LOG[i>>1]+1;}
void mod(poly &p, int l) {
for (int i = l; i >= lh; i--)
if (p[i]) p ^= (h<<(i-lh));
}
poly mul(poly a, poly b) {
poly ret = 0;
for (int i = 0; i <= lh; i++)
if (a[i]) ret ^= b<<i;
mod(ret, 2*lh); return ret;
}
int getsta(int l, int r) {
int ret = 0;
for (int i = l; i <= r; i++)
ret |= f[i]<<(i-l);
return ret;
}
int main() {
read(lf); for (int i = 0, x; i <= lf; i++) if (read(x)) f.set(i);
read(lg); for (int i = 0, x; i <= lg; i++) if (read(x)) g.set(i);
read(lh); for (int i = 0, x; i <= lh; i++) if (read(x)) h.set(i);
mod(g, lg), pw[0].set(0), s[1].set(0), t.set(0), init();
for (int i = 1; i <= 16; i++) pw[i] = mul(pw[i-1], g);
for (int i = 2; i < (1<<16); i++) s[i] = s[i-(1<<LOG[i])]^pw[LOG[i]];
for (int i = 0; i <= lf; i += 16, t = mul(t, pw[16]))
ans ^= mul(s[getsta(i, i+15)], t);
for (len = lh; !ans[len] && len; len--) ;
printf("%d ", len);
for (int i = 0; i <= len; i++)
printf("%d ", ans[i] ? 1 : 0);
return puts(""), 0;
}