Down to be stored in a table. PDFs are an example of such data. Semi-structured data cannot be managed in a standard way, but it does have defined elements; HTML is an example. All this data can be collected directly or indirectly, depending on its origin. It can be generated by people, come from data transactions or Internet browsing, be of biometric origin (from security, defense and intelligence services), or be of the machine-to-machine (M2M) type.
The latter originates from technologies that share data with devices. The operation of Big Data can be summarized as a process in which data is collected from the appropriate sources and then transformed into the format required for its analysis. These transformations are carried out on ETL (extract, transform, load) platforms, which process the data and then load it into the specified database. One example of an ETL platform is Pentaho Data Integration.
Its Spoon application is the graphical tool used to design such transformations. Big Data also includes the use of NoSQL storage systems, which are more flexible than relational databases and therefore allow large volumes of data to be manipulated quickly. Finally, once all this data is stored, it can be analyzed in the way best suited to the information sought and its possible applications. Taken together, this Big Data architecture allows the effective treatment of the collected information to help companies.
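The collect, transform and load process described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the two inline sources, the field names and the functions extract, transform and load are hypothetical, and a plain dictionary stands in for a schema-flexible NoSQL store. It does not use Pentaho Data Integration or any real database.

```python
import csv
import io
import json

# Hypothetical raw inputs standing in for two different data sources.
CSV_SOURCE = "user_id,amount\n1,10.50\n2,3.20\n"
JSON_SOURCE = '[{"user": 1, "page": "/home"}, {"user": 3, "page": "/cart"}]'


def extract():
    """Collect raw records from each source, tagging where they came from."""
    rows = [("transactions", r) for r in csv.DictReader(io.StringIO(CSV_SOURCE))]
    rows += [("browsing", r) for r in json.loads(JSON_SOURCE)]
    return rows


def transform(rows):
    """Convert every raw record into one common format for analysis."""
    documents = []
    for source, raw in rows:
        if source == "transactions":
            doc = {"user_id": int(raw["user_id"]), "amount": float(raw["amount"])}
        else:
            doc = {"user_id": int(raw["user"]), "page": raw["page"]}
        doc["source"] = source
        documents.append(doc)
    return documents


def load(documents, store):
    """Append documents to the store by user; no fixed schema is enforced."""
    for doc in documents:
        store.setdefault(doc["user_id"], []).append(doc)
    return store


# A plain dict is an assumed stand-in for a NoSQL document store.
store = load(transform(extract()), {})
```

Documents with different fields (an amount for transactions, a page for browsing) sit side by side under the same user key, which is the kind of flexibility the text attributes to NoSQL storage; analysis would then run over the contents of store.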