Implementing a data lake, defined as “a massive, easily accessible data repository built on (relatively) inexpensive computer hardware for storing ‘big data’”, can avoid duplication of effort, encompass a company’s full data store, simplify data acquisition and storage, and democratize data within the enterprise. For details, see our previous post, What is a Data Lake? What are the […]
When you use the cloud for heavy-duty engineering computations, you quickly find out that although storage is inexpensive, you never know how fast you can access your data. If you want guaranteed performance in terms of IOPS, the price quickly goes up. This has to do with the distance your data has to travel. […]
Here is a common scenario: a huge, multi-year project is finally completed, the product is shipped, and the team moves on to a new project or new version. Months or years later, a problem comes up that has been solved before, and one unlucky engineering team is tasked with finding a solution embedded in […]
Our working memory can hold only 7 ± 2 chunks of information. When the number of chunks increases, we recode the information by breaking it into categories that each contain 7 ± 2 chunks. We also saw how, in the case of data, we prefer to use a hierarchical system of managing files in which […]