\`x^2+y_1+z_12^34\`
Advanced Search
Article Contents
Article Contents

Architecture-aware coding for distributed storage: Repairable block failure resilient codes

  • * Corresponding author: Gokhan Calis

    * Corresponding author: Gokhan Calis 
G. Calis is with Western Digital Corporation, San Jose, CA 95119. O. Ozan Koyluoglu is with University of California, Berkeley, CA 94720. The paper was submitted when both authors were with Department of Electrical and Computer Engineering, University of Arizona, Tucson, AZ 85712. This paper was in part presented at 2014 IEEE International Symposium on Information Theory (ISIT 2014), Honolulu, HI, June 2014. This work is supported in part by the National Science Foundation under Grants No CCF-1563622 and CNS-1617335.
Abstract / Introduction Full Text(HTML) Figure(11) / Table(1) Related Papers Cited by
  • In large scale distributed storage systems (DSS) deployed in cloud computing, correlated failures resulting in simultaneous failure (or, unavailability) of blocks of nodes are common. In such scenarios, the stored data or a content of a failed node can only be reconstructed from the available live nodes belonging to the available blocks. To analyze the resilience of the system against such block failures, this work introduces the framework of Block Failure Resilient (BFR) codes, wherein the data (e.g., a file in DSS) can be decoded by reading out from a same number of codeword symbols (nodes) from a subset of available blocks of the underlying codeword. Further, repairable BFR codes are introduced, wherein any codeword symbol in a failed block can be repaired by contacting a subset of remaining blocks in the system. File size bounds for repairable BFR codes are derived, and the trade-off between per node storage and repair bandwidth is analyzed, and the corresponding minimum storage regenerating (BFR-MSR) and minimum bandwidth regenerating (BFR-MBR) points are derived. Explicit codes achieving the two operating points for a special case of parameters are constructed, wherein the underlying regenerating codewords are distributed to BFR codeword symbols according to combinatorial designs. Finally, BFR locally repairable codes (BFR-LRC) are introduced, an upper bound on the resilience is derived and optimal code construction are provided by a concatenation of Gabidulin and MDS codes. Repair efficiency of BFR-LRC is further studied via the use of BFR-MSR/MBR codes as local codes. Code constructions achieving optimal resilience for BFR-MSR/MBR-LRCs are provided for certain parameter regimes. Overall, this work introduces the framework of block failures along with optimal code constructions, and the study of architecture-aware coding for distributed storage systems.

    Mathematics Subject Classification: Primary: 58F15, 58F17; Secondary: 53C35.

    Citation:

    \begin{equation} \\ \end{equation}
  • 加载中
  • Figure 1.  A data-center architecture where top-of-rack (TOR) switch of the first rack fails

    Figure 2.  (a) Node repair in regenerating codes. (b) Node repair in relaxation of regenerating codes. (c) Trade-off curves in toy example.

    Figure 3.  Repair process for $b = 2$ (two blocks) case

    Figure 4.  (a) Ratio $\frac{\gamma^{\textrm{k-odd}}_{\textrm{BFR-MSR}}}{\gamma_{\textrm{MBR}}}$ vs. $d$. (b) Ratio $\frac{\gamma^{\textrm{k-even}}_{\textrm{BFR-MSR}}}{\gamma_{\textrm{MBR}}}$ vs. $d$

    Figure 5.  Transpose code is a two-block BFR-MBR code

    Figure 6.  (a) Matrix representation of block design. (b) Threeblock BFR-RC

    Figure 7.  Illustrating the construction of BFR codes using projective plane based placement of regenerating codes. ($n' = \tilde{n}-\frac{\tilde{n}}{r}+1.)$

    Figure 8.  Data center architecture. Each rack resembles a block and each cluster forms a local group for node repair operations in the BFR model

    Figure 9.  Repair delay vs. storage overhead comparisons for $b = 7$, $n = 21$ and $\sigma = 3$. (a) Data points for possible cases of each node. (b) Lower envelope of Fig. 9a when zoomed in

    Figure 10.  Repair delay vs. storage overhead comparisons for $b = 7$ and $\sigma = 3$, (a) when $n = 21$, (b) when $n = 35$

    Figure 11.  Construction of a set $\mathcal{B}^*$ with $H(\mathbf{c}(\mathcal{B}^*)) <\mathcal{M}$ for BFR-LRC

    Table 1.  Summary of the contributions

    Characterization of BFR points $d_r \geq k_c $ and $\sigma <\rho$Section 3.1.1
    Characterization of BFR points $d_r \geq k_c $ and $\sigma \geq \rho$Section 3.1.2
    Characterization of BFR points $d_r < k_c $ and $\sigma < \rho$Section 3.2
    BFR codes $b=2$, $d_r \geq k_c $, $\sigma=1$ and $\rho=0$Section 4.1
    BFR codes $d_r \geq k_c $, $\sigma=1$ and $\rho=0$Section 4.2
    BFR codes $d_r \geq k_c $, $\sigma < b-1$ and $\rho = 0$Section 4.3
    BFR-LRC and code constructionsresilience $\rho$Section 5.1, 5.2
    Local regeneration in BFR-LRC $d_r \geq k_c $, $\sigma < b-1$ and $\rho_L = 0$2Section 5.3
    Relaxed BFRany set of parametersSection 6.2
     | Show Table
    DownLoad: CSV
  • [1] M. BlaumJ. L. Hafner and S. Hetzler, Partial-MDS codes and their application to RAID type of architectures, IEEE Transactions on Information Theory, 59 (2013), 4510-4519.  doi: 10.1109/TIT.2013.2252395.
    [2] M. Blaum and J. S. Plank, Construction of two SD codes, arXiv: 1305.1221.
    [3] V. R. Cadambe, C. Huang and J. Li, Permutation code: Optimal exact-repair of a single failed node in MDS code based distributed storage systems, Proc. 2011 IEEE International Symposium on Information Theory (ISIT 2011), (2011). doi: 10.1109/ISIT.2011.6033730.
    [4] G. Calis and O. O. Koyluoglu, A general construction for PMDS codes, IEEE Communications Letters, 21 (2017), 452-455.  doi: 10.1109/LCOMM.2016.2627569.
    [5] G. Calis and O. O. Koyluoglu, Repairable block failure resilient codes, Proc. 2014 IEEE International Symposium on Information Theory (ISIT 2014), (2014). doi: 10.1109/ISIT.2014.6875271.
    [6] A. G. DimakisP. B. GodfreyY. WuM. J. Wainwright and K. Ramchandran, Network coding for distributed storage systems, IEEE Transactions on Information Theory, 56 (2010), 4539-4551. 
    [7] A. G. DimakisK. RamchandranY. Wu and S. Changho, A survey on network codes for distributed storage, Proc. of the IEEE, 99 (2011), 476-489.  doi: 10.1109/JPROC.2010.2096170.
    [8] D. Ford, F. Labelle, F. I. Popovici, M. Stokely, V. A. Truong, L. Barroso, C. Grimes and S. Quinlan, Availability in globally distributed storage systems, Proc. Ninth USENIX Symposium on Operating Systems Design and Implementation (OSDI'10), (2010).
    [9] E. M. Gabidulin, Theory of codes with maximum rank distance, Problemy Peredachi Informatsii, 21 (1985), 3-16. 
    [10] B. Gaston, J. Pujol and M. Villanueva, A realistic distributed storage system: The rack model, arXiv: 1302.5657.
    [11] S. Ghemawat, H. Gobioff and S. T. Leung, The Google file system, Proc. Nineteenth ACM Symposium on Operating Systems Principles, (2003). doi: 10.1145/945445.945450.
    [12] P. GopalanH. ChengH. Simitci and S. Yekhanin, On the locality of codeword symbols, IEEE Transactions on Information Theory, 58 (2012), 6925-6934.  doi: 10.1109/TIT.2012.2208937.
    [13] T. HoM. MedardR. KoetterD. R. KargerM. EffrosJ. Shi and B. Leong, A random linear network coding approach to multicast, IEEE Transactions on Information Theory, 52 (2006), 4413-4430.  doi: 10.1109/TIT.2006.881746.
    [14] C. Huang, H. Simitci, Y. Xu, A. Ogus, B. Calder, P. Gopalan, J. Li and S. Yekhanin, Erasure coding in Windows azure storage, Proc. USENIX Annual Technical Conference, 2012.
    [15] C. Huang, M. Chen and J. Li, Pyramid codes: Flexible schemes to trade space for access efficiency in reliable data storage systems, Proc. 2007 Sixth IEEE International Symposium on Network Computing and Applications (IEEE NCA07), (2007).
    [16] Y. IshaiE. KushilevitzR. Ostrovsky and A. Sahai, Batch codes and their application, Proc. Thirty-sixth Annual ACM Symposium on Theory of Computing, (2004), 262-271.  doi: 10.1145/1007352.1007396.
    [17] G. M. KamathN. PrakashV. Lalitha and P. V. Kumar, Codes with local regeneration and erasure correction, IEEE Transactions on Information Theory, 60 (2014), 4637-4660.  doi: 10.1109/TIT.2014.2329872.
    [18] G. M. Kamath, N. Silberstein, N. Prakash, A. S. Rawat, V. Lalitha, O. O. Koyluoglu, P. V. Kumar and S. Vishwanath, Explicit MBR all-symbol locality codes, Proc. 2013 IEEE International Symposium on Information Theory (ISIT 2013), (2013). doi: 10.1109/ISIT.2013.6620277.
    [19] O. O. KoyluogluA. S. Rawat and S. Vishwanath, Secure cooperative regenerating codes for distributed storage systems, IEEE Transactions on Information Theory, 60 (2014), 5228-5244.  doi: 10.1109/TIT.2014.2319271.
    [20] F. J. MacWilliams and N. J. A. Sloane, The theory for error-correcting codes, Amsterdam-New York-Oxford, 1977.
    [21] F. Oggier and A. Datta, Self-repairing homomorphic codes for distributed storage systems, Proc. 2011 IEEE INFOCOM, (2011). doi: 10.1109/INFCOM.2011.5934901.
    [22] O. Olmez and A. Ramamoorthy, Fractional repetition codes with flexible repair from combinatorial designs, IEEE Transactions on Information Theory, 62 (2016), 1565-1591.  doi: 10.1109/TIT.2016.2531720.
    [23] D. S. Papailiopoulos and A. G. Dimakis, Locally repairable codes, IEEE International Symposium on Information Theory, 60 (2014), 5843-5855.  doi: 10.1109/TIT.2014.2325570.
    [24] D. S. PapailiopoulosA. G. Dimakis and V. R. Cadambe, Repair optimal erasure codes through Hadamard designs, IEEE Transactions on Information Theory, 59 (2013), 3021-3037.  doi: 10.1109/TIT.2013.2241819.
    [25] N. Prakash, V. Abdrashitov and M. Médard, The storage vs repair-bandwidth trade-off for clustered storage systems, IEEE Transactions on Information Theory (Early Access), (2008), 1–1, arXiv: 1701.04909. doi: 10.1109/TIT.2018.2806342.
    [26] N. Prakash, G. M. Kamath, V. Lalitha and P. V. Kumar, Optimal linear codes with a local-error-correction property, Proc. 2012 IEEE International Symposium on Information Theory (ISIT 2012), (2012). doi: 10.1109/ISIT.2012.6284028.
    [27] K. V. RashmiN. B. Shah and P. V. Kumar, Optimal exact-regenerating codes for distributed storage at the MSR and MBR points via a product-matrix construction, IEEE Transactions on Information Theory, 57 (2011), 5227-5239.  doi: 10.1109/TIT.2011.2159049.
    [28] K. V. Rashmi, N. B. Shah and P. V. Kumar, Enabling node repair in any erasure code for distributed storage, Proc. 2011 IEEE International Symposium on Information Theory (ISIT 2011), (2011). doi: 10.1109/ISIT.2011.6033732.
    [29] A. S. RawatO. O. KoyluogluN. Silberstein and S. Vishwanath, Optimal locally repairable and secure codes for distributed storage systems, IEEE Transactions on Information Theory, 60 (2014), 212-236.  doi: 10.1109/TIT.2014.2319271.
    [30] A. S. Rawat, D. S. Papailiopoulos, A. G. Dimakis and S. Vishwanath, Locality and availability in distributed storage, IEEE Trans. Inform. Theory, 62 (2016), 4481–4493, arXiv: 1402.2011. doi: 10.1109/TIT.2016.2524510.
    [31] S. E. Rouayheb and K. Ramchandran, Fractional repetition codes for repair in distributed storage systems, Proc. 48th Annual Allerton Conference on communication, control and computing, (2010). doi: 10.1109/ALLERTON.2010.5707092.
    [32] M. SathiamoorthyM. AsterisD. S. PapailiopoulosA.G. DimakisR. VadaliS. Chen and D. Borthakur, XORing elephants: Novel erasure codes for big data, Proc. VLDB Endow., 6 (2013), 325-336. 
    [33] N. Silberstein and T. Etzion, Optimal fractional repetition codes based on graphs and designs, IEEE Transactions on Information Theory, 61 (2015), 4164-4180.  doi: 10.1109/TIT.2015.2442231.
    [34] N. Silberstein, A. S. Rawat and S. Vishwanath, Error resilience in distributed storage via rank-metric codes, Proc. 50th Annual Allerton Conference on communication, control and computing, (2012). doi: 10.1109/Allerton.2012.6483348.
    [35] N. SilbersteinA. S. Rawat and S. Vishwanath, Error-correcting regenerating and locally repairable codes via rank-metric codes, IEEE Transactions on Information Theory, 61 (2015), 5765-5778.  doi: 10.1109/TIT.2015.2480848.
    [36] D. R. Stinson, Combinatorial designs: Construction and analysis, ACM SIGACT News, 39 (2008), 17-21.  doi: 10.1145/1466390.1466393.
    [37] I. Tamo and A. Barg, A family of optimal locally recoverable codes, IEEE Transactions on Information Theory, 60 (2014), 4661-4676.  doi: 10.1109/TIT.2014.2321280.
    [38] I. TamoW. Zhiying and J. Bruck, Zigzag codes: MDS array codes with optimal rebuilding, IEEE Transactions on Information Theory, 59 (2013), 1597-1616.  doi: 10.1109/TIT.2012.2227110.
    [39] C. Tian, B. Sasidharan, V. Aggarwal, V. A. Vaishampayan and P. V. Kumar, Layered, exactrepair regenerating codes via embedded error correction and block designs, IEEE Trans. Inform. Theory, 61 (2015), 1933–1947, arXiv: 1408.0377. doi: 10.1109/TIT.2015.2408595.
    [40] A. Wang and Z. Zhang, Repair locality with multiple erasure tolerance, IEEE Transactions on Information Theory, 60 (2014), 6979-6987.  doi: 10.1109/TIT.2014.2351404.
    [41] Y. Wu, A. G. Dimakis and K. Ramchandran, Deterministic regenerating codes for distributed storage, Proc. 45th Annual Allerton Conference on Communication, Control and Computing, (2007).
    [42] Y. Wu, Existence and construction of capacity-achieving network codes for distributed storage, IEEE Journal on Selected Areas in Communications, 28 (2010), 277-288.  doi: 10.1109/ISIT.2009.5206008.
  • 加载中

Figures(11)

Tables(1)

SHARE

Article Metrics

HTML views(2084) PDF downloads(345) Cited by(0)

Access History

Other Articles By Authors

Catalog

    /

    DownLoad:  Full-Size Img  PowerPoint
    Return
    Return