Post on 04-Aug-2020
Mutually Uncorrelated Codes for DNA Storage
Maya Levy and Eitan Yaakobi
Technion - Israel Institute of Technology
Coding Seminar
Outline
• Motivation
• Mutually Uncorrelated Codes
  • Well-known construction analysis
  • Non-fixed Run Length Limited constraint
  • Efficient construction
• $(d_h, d_m)$-Mutually Uncorrelated Codes
  • Upper bound on cardinality
  • Efficient construction
• Ongoing and future work
DNA Storage
• G. M. Church, Y. Gao, and S. Kosuri, “Next-generation digital information storage in DNA,” Science, vol. 337, no. 6102, pp. 1628–1628, Sep. 2012
• N. Goldman, P. Bertone, S. Chen, C. Dessimoz, E. M. LeProust, B. Sipos, and E. Birney, “Towards practical, high-capacity, low maintenance information storage in synthesized DNA,” Nature, vol. 494, no. 7435, Feb. 2013
• R. N. Grass, R. Heckel, M. Puddu, D. Paunescu, and W. J. Stark, “Robust chemical preservation of digital information on DNA in silica with error correcting codes,” Angewandte Chemie International Edition, vol. 54, no. 8, pp. 2552–2555, Feb. 2015
• M. Blawat, K. Gaedke, I. Huetter, X. M. Chen, B. Turczyk, S. Inverso, B. W. Pruitt, and G. M. Church, “Forward error correction for DNA data storage,” Procedia Computer Science, vol. 80, pp. 1011–1022, 2016
• Y. Erlich and D. Zielinski, “DNA Fountain enables a robust and efficient storage architecture,” Science, vol. 355, no. 6328, pp. 950–954, Mar. 2017
GCCTCAAAGTTACACCGTGCATTT
…ACGTAC
Random Access
• S. M. H. T. Yazdi, Y. Yuan, J. Ma, H. Zhao, and O. Milenkovic, “A rewritable, random-access DNA-based storage system,” Nature Scientific Reports, vol. 5, no. 14138, Aug. 2015
• J. Bornholt, R. Lopez, D. M. Carmean, L. Ceze, G. Seelig, and K. Strauss, “A DNA-based archival storage system,” ASPLOS, pp. 637–649, Atlanta, GA, Apr. 2016
Addresses Set Constraints
• Mutual uncorrelatedness of sequences
• Large minimum Hamming distance
S. M. H. T. Yazdi, Y. Yuan, J. Ma, H. Zhao, and O. Milenkovic, “A rewritable, random-access DNA-based storage system,” Nature Scientific Reports, vol. 5, no. 14138, Aug. 2015
Mutually Uncorrelated Codes
Two not necessarily distinct words $a, b \in \mathbb{F}_q^n$ are mutually uncorrelated if no non-trivial prefix of $a$ matches a non-trivial suffix of $b$, and vice versa.
Example: $a$ = 0 0 0 1 0, $b$ = 1 1 0 0 1
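The prefix/suffix comparison in the definition can be checked mechanically. The following is a minimal Python sketch, assuming words are given as equal-length strings and that "non-trivial" means lengths 1 to n-1; the helper name `mutually_uncorrelated` is hypothetical and not taken from the talk.

```python
def mutually_uncorrelated(a: str, b: str) -> bool:
    """True if no prefix of a (length 1..n-1) equals a suffix of b, and vice versa."""
    assert len(a) == len(b), "words must have the same length n"
    n = len(a)
    for i in range(1, n):  # non-trivial lengths only
        if a[:i] == b[n - i:] or b[:i] == a[n - i:]:
            return False
    return True


a, b = "00010", "11001"  # the two words shown on the slide
print(mutually_uncorrelated(a, b))  # a against b
print(mutually_uncorrelated(a, a))  # a against itself
```

Under this check the pair (a, b) passes, while a fails against itself because its length-1 prefix 0 equals its length-1 suffix. This is why the definition insists on "not necessarily distinct" words.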
Mutually Uncorrelated Codes
A code $C \subseteq \mathbb{F}_q^n$ is a mutually uncorrelated (MU) code if any two not necessarily distinct codewords of $C$ are mutually uncorrelated.
Construction of MU Codes
• MU codes have been studied since the 1960s for synchronization purposes
• We present results over $\mathbb{F}_2^n$. Most of the results can be extended to larger fields
• The following MU code construction is the best known in terms of redundancy. It was introduced by Gilbert (1960), Levenshtein (1964), and Chee et al. (2013)
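Codeword structure: $0^k\,1\,u\,1$, that is, $k$ zeros, a 1, a vector $u$ of length $n-k-2$ in which every zeros run is shorter than $k$, and a final 1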
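For small parameters the construction can be enumerated and checked by brute force. The sketch below assumes the codeword layout "k zeros, a 1, a free part with no run of k zeros, a final 1"; the names `construct_mu_code` and `is_mu_code` are hypothetical helpers, and the exhaustive enumeration is only meant for tiny n.

```python
from itertools import product


def construct_mu_code(n: int, k: int) -> list[str]:
    """All words 0^k 1 u 1 of length n where u has no run of k zeros."""
    m = n - k - 2  # length of the free part u
    if k < 1 or m < 0:
        return []
    words = []
    for bits in product("01", repeat=m):
        u = "".join(bits)
        if "0" * k not in u:
            words.append("0" * k + "1" + u + "1")
    return words


def is_mu_code(code: list[str]) -> bool:
    """Check every ordered pair of codewords, including a word with itself."""
    for a in code:
        for b in code:
            n = len(a)
            if any(a[:i] == b[n - i:] for i in range(1, n)):
                return False
    return True


# Size of C(10, k) for several k, and whether the MU property holds.
for k in range(1, 6):
    code = construct_mu_code(10, k)
    print(k, len(code), is_mu_code(code))
```

Running the loop shows how the code size depends on the choice of k for a fixed n, which is the optimization question the talk goes on to analyze.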
For every $n$ and $k$, we have a code $C(n,k)$.
Problem: for a given $n$, what is $\max_k |C(n,k)|$?
Previous results: for $n = 2^i$, $\max_k |C(n,k)| \gtrsim \frac{2^n}{2en}$
Our contribution (*):
$\max_k |C(n,k)| \approx \frac{2^n}{n}\, 2^{F(\Delta_n)} \le \frac{2^n}{2en}$
where $\Delta_n = \log n - \lceil \log n \rceil$ and $F(\Delta) = \Delta - \min\{2^{\Delta}\log e + 1,\ 2^{\Delta+1}\log e\}$
For $n = 2^i$: $\max_k |C(n,k)| \approx \frac{2^n}{2en}$
(*) Thanks to Ron Roth for his contribution to this proof
Notation: $f(n) \approx g(n)$ if $\lim_{n\to\infty} \frac{f(n)}{g(n)} = 1$, and $f(n) \gtrsim g(n)$ if $\lim_{n\to\infty} \frac{f(n)}{g(n)} \ge 1$
Mutually Uncorrelated Codes
             | Construction, $n = 2^i$  | Upper bound by Levenshtein, '70
Cardinality  | $\frac{2^n}{2en}$        | $\frac{2^n}{en}$
Redundancy   | $\log e + \log n + 1$    | $\log e + \log n$
Efficient construction? Take $\lceil \log n \rceil + 1$ zeros, followed by a part in which every zeros run is shorter than $\lceil \log n \rceil + 1$
Zeros Run Length Limited Encoding
Problem: encode a length-$n$ vector into $n + 1$ bits such that every zeros run has length $< \lceil \log n \rceil + 1$
Example: $n = 30$, $\lceil \log n \rceil + 1 = 6$
Input: 0 1 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 1 1 0 1 0 0 0 0 0 0 1
• Append a 1: 0 1 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 1 1 0 1 0 0 0 0 0 0 1 1
• Scan the indices $i = 1, 2, \ldots$; the first run of 6 zeros starts at $i = 4$
• Remove that run: 0 1 1 0 1 1 0 0 0 0 0 0 0 1 1 0 1 0 0 0 0 0 0 1 1
• Append the index $i = 4$ written in $\lceil \log n \rceil = 5$ bits (0 0 1 0 0), followed by a 0: 0 1 1 0 1 1 0 0 0 0 0 0 0 1 1 0 1 0 0 0 0 0 0 1 1 0 0 1 0 0 0
• Repeat until no run of 6 zeros remains
Output: 0 1 1 0 1 1 0 1 1 0 1 1 1 0 0 1 0 0 0 0 0 1 1 1 0 0 1 1 0 0 0
• Uniquely decodable
• Zeros run length $\le \lceil \log n \rceil$ (the slides annotate the appended indices with $i \ne 0$ and $i \le j$)
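The steps shown in the example can be sketched in Python as follows. This is a reconstruction from the slide frames, not the authors' reference implementation: the function names are hypothetical, bits are held in strings, indices are 1-based as on the slides, and n >= 4 is assumed so that every index fits in ceil(log n) bits.

```python
def rll_encode(x: str) -> str:
    """Map n bits to n + 1 bits with no run of ceil(log2 n) + 1 zeros."""
    n = len(x)
    assert n >= 4, "sketch assumes n >= 4"
    L = (n - 1).bit_length()       # ceil(log2 n)
    run = "0" * (L + 1)            # forbidden pattern
    y = x + "1"                    # trailing 1 means "no pointer follows"
    while True:
        pos = y.find(run)          # first forbidden run, 0-based
        if pos < 0:
            return y
        i = pos + 1                # 1-based index, so i != 0
        # delete the run, then append i in L bits followed by a 0
        y = y[:pos] + y[pos + L + 1:] + format(i, f"0{L}b") + "0"


def rll_decode(y: str) -> str:
    """Invert rll_encode by undoing the appended pointers last-to-first."""
    L = (len(y) - 2).bit_length()  # len(y) = n + 1
    while y[-1] == "0":
        i = int(y[-(L + 1):-1], 2)                  # read the pointer
        y = y[:-(L + 1)]                            # strip pointer and its 0
        y = y[:i - 1] + "0" * (L + 1) + y[i - 1:]   # re-insert the run
    return y[:-1]                  # drop the marker 1


x = "011000000011000000011010000001"   # the length-30 input from the slides
y = rll_encode(x)
print(y, rll_decode(y) == x)
```

Each step removes $\lceil \log n \rceil + 1$ bits and appends the same number, so the length stays $n + 1$; the last bit tells the decoder whether another pointer has to be undone.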
MU Codes Summary
             | Construction, $n = 2^i$  | Upper bound by Levenshtein, '70 | Efficient construction, $n = 2^i$
Cardinality  | $\frac{2^n}{2en}$        | $\frac{2^n}{en}$                | $\frac{2^n}{16n}$
Redundancy   | $\log e + \log n + 1$    | $\log e + \log n$               | $\log n + 4$
• W. Kautz, “Fibonacci codes for synchronization control,” IEEE Transactions on Information Theory, vol. 11, no. 2, pp. 284–292, 1965
• C. Schoeny, A. Wachter-Zeh, R. Gabrys, and E. Yaakobi, “Codes for correcting a burst of deletions or insertions,” in Proc. IEEE Int. Symp. Inf. Theory, pp. 630–634, Barcelona, Spain, Jul. 2016
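The redundancy row follows from the cardinality row. As a worked check, assume the usual definition that a binary code of length $n$ with $M$ codewords has redundancy $n - \log M$, with logarithms to base 2:

```latex
\begin{align*}
n - \log\frac{2^n}{2en} &= \log(2en) = \log e + \log n + 1
  && \text{(construction, } n = 2^i\text{)} \\
n - \log\frac{2^n}{en} &= \log(en) = \log e + \log n
  && \text{(Levenshtein's upper bound on the cardinality)} \\
n - \log\frac{2^n}{16n} &= \log(16n) = \log n + 4
  && \text{(efficient construction, } n = 2^i\text{)}
\end{align*}
```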
Non Fixed Zero Run Length Analysis
• $S(n,k)$: the length-$n$ vectors with no zeros run of length $k$; $s(n,k) = |S(n,k)|$
• $|C(n,k)| = s(n-k-2,\ k)$
• The capacity of the $(0, k-1)$-RLL constraint: $E_{0,k-1} = \lim_{\ell\to\infty} \frac{\log s(\ell,k)}{\ell}$, for fixed $k$
Question: $s(n, \lceil \log n \rceil + a) \approx\ ?$ for $a \in \mathbb{Z}$
Upper bound: cutting a vector of $S(mn,k)$ into $m$ consecutive blocks of length $n$ gives $m$ vectors of $S(n,k)$, so
$S(mn,k) \subseteq S(n,k)^m$
$s(mn,k) \le s(n,k)^m$
Lower bound:
• $T(n,k)$: $S(n,k)$ without the vectors that begin or end with $\lceil k/2 \rceil$ zeros; $t(n,k) = |T(n,k)|$
• Number of removed vectors $\le 2 \cdot 2^{\,n-\frac{k}{2}+1}$
• Concatenating any $m$ vectors of $T(n,k)$ gives a vector of $S(mn,k)$: $T(n,k)^m \subseteq S(mn,k)$
• $\left(s(n,k) - 2 \cdot 2^{\,n-\frac{k}{2}+1}\right)^m \le t(n,k)^m \le s(mn,k)$
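Combining the two bounds:
• $\left(s(n,k) - 2 \cdot 2^{\,n-\frac{k}{2}+1}\right)^m \le s(mn,k) \le s(n,k)^m$
• The capacity of the $(0, k-1)$-RLL constraint: $E_{0,k-1} = \lim_{\ell\to\infty} \frac{\log s(\ell,k)}{\ell} = \lim_{m\to\infty} \frac{\log s(mn,k)}{mn}$
• $2^{nE_{0,k-1}} \le s(n,k) \le 2^{nE_{0,k-1}} + 2 \cdot 2^{\,n-\frac{k}{2}+1}$
• $s(n, \lceil \log n \rceil + a) \approx 2^{nE_{0,k-1}}$ with $k = \lceil \log n \rceil + a$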
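$s(n, \lceil \log n \rceil + a) \approx 2^{\,nE_{0,\lceil \log n \rceil + a - 1}}$
A. Kato and K. Zeger '05: $\lim_{k\to\infty} \frac{1 - E_{0,k}}{\log e \cdot 2^{-k-2}} = 1$
Hence $2^{\,nE_{0,\lceil \log n \rceil + a - 1}} \approx \frac{2^n}{e^{2^{\Delta_n - a - 1}}}$, where $\Delta_n = \log n - \lceil \log n \rceil \in (-1, 0]$
$s(n, \lceil \log n \rceil + a) \approx \frac{2^n}{e^{2^{\Delta_n - a - 1}}}$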
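The approximation can be compared with exact counts for moderate n. The sketch below computes s(n, k) with a standard recurrence (a valid word of length m >= k ends with a 1 followed by fewer than k zeros); the function names are hypothetical, and n = 1024 with a = 0 is an arbitrary illustrative choice.

```python
import math


def count_no_zero_run(n: int, k: int) -> int:
    """s(n, k): number of binary words of length n with no run of k zeros."""
    # every word shorter than k is valid
    s = [2 ** m for m in range(min(n, k - 1) + 1)]
    for m in range(k, n + 1):
        # strip the final "1 0^j" (0 <= j < k): a valid word of length m-1-j remains
        s.append(sum(s[m - k:m]))
    return s[n]


def approx_ratio(n: int, a: int) -> float:
    """The approximation 1 / e^(2^(Delta_n - a - 1)) of s(n, ceil(log n) + a) / 2^n."""
    ceil_log = (n - 1).bit_length()
    delta = math.log2(n) - ceil_log
    return math.exp(-(2.0 ** (delta - a - 1)))


n, a = 1024, 0
k = (n - 1).bit_length() + a
exact = count_no_zero_run(n, k) / 2 ** n   # exact fraction of valid words
print(exact, approx_ratio(n, a))
```

The same counting function also gives the size of the construction directly, since $|C(n,k)| = s(n-k-2, k)$.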
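Construction of MU Codes
For every $k = \lceil \log n \rceil + a$:
$|C(n,k)| \approx \frac{2^n}{n}\, 2^{f_a(\Delta_n)}$, where $\Delta_n = \log n - \lceil \log n \rceil$ and $f_a(\Delta) = \Delta - \frac{\log e}{2^{a+1}}\, 2^{\Delta} - a - 2$
$\max_k |C(n,k)| \approx \frac{2^n}{n}\, 2^{F(\Delta_n)} \le \frac{2^n}{2en}$, where $F(\Delta) = \Delta - \min\{2^{\Delta}\log e + 1,\ 2^{\Delta+1}\log e\}$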
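The closed form $F$ is the maximum of $f_a$ over the integer offset $a$, attained at $a = -1$ or $a = -2$. A small numerical sketch of that relation follows, assuming base-2 logarithms throughout; the search range for $a$ is an arbitrary choice for illustration.

```python
import math

LOG_E = math.log2(math.e)


def f(a: int, delta: float) -> float:
    """Exponent f_a(Delta) for the choice k = ceil(log n) + a."""
    return delta - (LOG_E / 2 ** (a + 1)) * 2 ** delta - a - 2


def F(delta: float) -> float:
    """Closed form of the maximum of f_a(Delta) over integer a."""
    return delta - min(2 ** delta * LOG_E + 1, 2 ** (delta + 1) * LOG_E)


for delta in (0.0, -0.25, -0.5, -0.75):
    best_a = max(range(-6, 7), key=lambda a: f(a, delta))
    print(delta, best_a, f(best_a, delta), F(delta))
```

For $\Delta = 0$, that is $n = 2^i$, the value $2^{F(0)}$ equals $\frac{1}{2e}$, which recovers the $\frac{2^n}{2en}$ cardinality stated for powers of two.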
Outline
• Motivation
• Mutually Uncorrelated Codes
  • Well-known construction analysis
  • Non-fixed Run Length Limited constraint
  • Efficient construction
• $(d_h, d_m)$-Mutually Uncorrelated Codes
  • Definition
  • Upper bound on cardinality
  • Efficient construction
• Ongoing and future work
$(d_h, d_m)$-Mutually Uncorrelated Codes
• Minimum Hamming distance $d_h$
• Each prefix of length $i \in [1, n-1]$ differs from each suffix of length $i$ in at least $\min\{i, d_m\}$ symbols
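Both conditions of the definition can be verified by brute force on a small code. The sketch below is an illustration only: `is_dh_dm_mu` is a hypothetical helper, and the word 00010111 is an example chosen here, not one from the talk.

```python
def hamming(x: str, y: str) -> int:
    """Number of positions in which two equal-length strings differ."""
    return sum(c != d for c, d in zip(x, y))


def is_dh_dm_mu(code: list[str], dh: int, dm: int) -> bool:
    """Check both defining conditions of a (d_h, d_m)-MU code."""
    n = len(code[0])
    # 1) minimum Hamming distance between distinct codewords
    for idx, a in enumerate(code):
        for b in code[idx + 1:]:
            if hamming(a, b) < dh:
                return False
    # 2) prefix/suffix condition for every ordered pair, including a == b
    for a in code:
        for b in code:
            for i in range(1, n):
                if hamming(a[:i], b[n - i:]) < min(i, dm):
                    return False
    return True


print(is_dh_dm_mu(["00010111"], dh=1, dm=2))
print(is_dh_dm_mu(["00101", "00111"], dh=1, dm=1))
```

With $d_m = 1$ the second condition reduces to the plain MU property, so an ordinary MU code is a $(1, 1)$-MU code under this check.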
$(d_h, d_m)$-MU Codes Upper Bound
Theorem: Let $C$ be a $(d_h, d_m)$-MU code. Then
$|C| \le \frac{M(n,d)}{n/d_m}$, $d = \min\{d_h, 2d_m\}$
where $M(n,d)$ is the size of a maximal code of length $n$ with Hamming distance $d$.
Proof: Let $C$ be a $(d_h, d_m)$-MU code. Generate $C' = \{a^{(i)} : a \in C,\ i = x \cdot d_m + 1\}$, the cyclic shifts of the codewords by multiples of $d_m$ (figure: the word $a_1 \ldots a_{d_m} \ldots a_{2d_m} \ldots a_{3d_m} \ldots a_{4d_m}$ and its shifts).
Let $a', b' \in C'$ such that $a'$ is a shift of $a \in C$ and $b'$ is a shift of $b \in C$.
• $a'$ and $b'$ shifted by the same amount (figure: $a' = a_{d_m+1} \ldots a_1 \ldots a_{d_m}$, $b' = b_{d_m+1} \ldots b_1 \ldots b_{d_m}$): Hamming distance $\ge d_h$
• $a'$ and $b'$ shifted by different amounts: Hamming distance $\ge 2d_m$
Let $C$ be a $(d_h, d_m)$-MU code. Generate a new code: $C' = \{a^{(i)} : a \in C,\ i = x \cdot d_m + 1\}$
• $|C'| = \frac{n}{d_m} \cdot |C|$
• $d_{\min}(C') \ge d = \min\{d_h, 2d_m\}$
• $|C'| \le M(n,d)$, where $M(n,d)$ is the size of a maximal code with Hamming distance $d$
Hence $|C| \le \frac{M(n,d)}{n/d_m}$, $d = \min\{d_h, 2d_m\}$
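The shifted code $C'$ from the proof can be built explicitly for a toy example. The sketch below assumes the shifts are cyclic rotations by multiples of $d_m$; the one-word code and the helper names are hypothetical choices made for illustration.

```python
def hamming(x: str, y: str) -> int:
    """Number of positions in which two equal-length strings differ."""
    return sum(c != d for c, d in zip(x, y))


def shifted_code(code: list[str], dm: int) -> list[str]:
    """C': cyclic shifts of every codeword by 0, d_m, 2*d_m, ... positions."""
    n = len(code[0])
    shifts = range(0, (n // dm) * dm, dm)  # n // dm shifts per codeword
    return [a[s:] + a[:s] for a in code for s in shifts]


def min_distance(code: list[str]) -> int:
    """Minimum Hamming distance over all pairs of distinct positions in the list."""
    return min(hamming(a, b) for i, a in enumerate(code) for b in code[i + 1:])


C = ["00010111"]          # a single word satisfying the d_m = 2 condition
C_shift = shifted_code(C, dm=2)
print(C_shift, min_distance(C_shift))
```

In this toy case $C'$ has $n/d_m = 4$ words, and their pairwise distances can be compared with the $2d_m$ term of the bound.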
$(d_h, d_m)$-MU Codes Construction
0 0 0 0 0 0 0 0 u u u u u u 1 1 1 1 1 1 1 1
Figure labels: $k$ zeros; $u$ vector; $d_m$ ones; $d_m$ ones; a code of minimum distance $d_h$; the weight of every length-$k$ window is $\ge d_m$
Window Weight Limited Encoding
Problem: encode a length-$n$ vector into $n + d_m$ bits such that every window of length $< \lceil \log n \rceil + (d_m - 1)\log\log n + C$ has weight $\ge d_m$.
Appended bits (figure): 1 1 1 1 1 0 1
• Index of the window: $\log n$ bits
• $(d_m - 1)$ indexes, of $\log\log n$ bits each, of the ones within the window
$(d_h, d_m)$-MU Codes Efficient Construction
0 0 0 0 0 0 0 0 u u u u u u 1 1 1 1 1 1 1 1
Figure labels: a code of minimum distance $d_h$; the weight of every length-$k$ window is $\ge d_m$
• $x \in \mathbb{F}_2^{n'}$ → window weight limited encoding → $y \in \mathbb{F}_2^{n'+d_m}$ → systematic BCH → $z$ of length $n' + d_m + \frac{d_h - 1}{2}\log n'$
• Codeword: $0^k\, u\, 1^{d_m}\, z\, 1^{d_m}$
Theorem: There exists a $(d_h, d_m)$-MU code with redundancy $\frac{d_h+1}{2}\log n + (d_m - 1)\log\log n + O(1)$ and linear time and space complexity
$(d_h, d_m)$-Mutually Uncorrelated Codes: redundancy
• Lower bound (for $2d_m \ge d_h$): $\frac{d_h+1}{2}\log n + O(1)$
• Efficient construction: $\frac{d_h+1}{2}\log n + (d_m - 1)\log\log n + O(1)$
Ongoing and Future Work
• Strengthen the bounds on $(d_h, d_m)$-MU codes
• Explore the factor-2 gap between the MU upper and lower bounds (cardinality: upper bound by Levenshtein, '70, $\frac{2^n}{en}$; construction, $n = 2^i$, $\frac{2^n}{2en}$)
• Analyze the window weight limited constraint with non-fixed window length (the weight of every length-$k$ window is $\ge d_m$)
• Extend to additional DNA-motivated constraints such as balanced codes and edit distance
THANK YOU
Thanks to Ryan Gabrys and Olgica Milenkovic for helpful discussions on DNA storage