Page 123 - Building Big Data Applications
Step 1: User text for search
User inputs a string
User clicks submit button
Search engine receives input
Step 2: Search engine process
Search engine opens up metadata and semantic libraries
User input is evaluated as one or more strings, which are used to look up the
appropriate metadata
Based on this evaluation of the string, the search engine determines the loop of
search cycles it needs to execute.
Once the loop is determined, the string and its various combinations will be
passed into the search engine
Each search iteration will:
⁃ Cycle through a metadata library
⁃ Match the search string with the metadata library
⁃ Extract all web crawls that the metadata matches, which have been indexed and
stored with their tags and URL addresses
⁃ Compile a list to return to the user
⁃ Return the results to the loop execution logic
Add results to output segment
Return results once all loops are completed
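The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of any real search engine: the in-memory `METADATA_LIBRARY`, the example URLs, and the functions `string_combinations` and `search` are all hypothetical, and the "combinations" of the input are assumed here to be its contiguous word sequences.

```python
# Hypothetical metadata library: metadata tag -> URLs of indexed web crawls.
METADATA_LIBRARY = {
    "big data": ["https://example.com/big-data-intro"],
    "data": ["https://example.com/data-basics",
             "https://example.com/big-data-intro"],
    "search": ["https://example.com/search-engines"],
}


def string_combinations(user_input):
    """Return every contiguous word sequence of the input, longest first.

    Assumption: this stands in for "the string and its various combinations".
    """
    words = user_input.lower().split()
    variants = []
    for size in range(len(words), 0, -1):
        for start in range(len(words) - size + 1):
            variants.append(" ".join(words[start:start + size]))
    return variants


def search(user_input, library=METADATA_LIBRARY):
    """Run one search iteration per string variant and collect the results."""
    output_segment = []  # results accumulated across all loop iterations
    seen_urls = set()    # avoids returning the same URL twice
    for variant in string_combinations(user_input):
        # Cycle through the metadata library and match the search string.
        for tag, urls in library.items():
            if tag != variant:
                continue
            # Extract the indexed crawls (URLs) stored under the matching tag.
            for url in urls:
                if url not in seen_urls:
                    seen_urls.add(url)
                    output_segment.append({"url": url, "matched_tag": tag})
    # Return results only once all loops are completed.
    return output_segment
```

Calling `search("big data search")` walks through six variants of the input and returns one entry per distinct matching URL. A production engine would replace the exact-match test with index lookups and semantic matching.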
The search process will do the following for each search:
⁃ Execute a neural network algorithm to perform the search and align metadata
⁃ Execute machine learning algorithms to align more search crawls for each
execution of search; this will work with unsupervised learning techniques
⁃ Execute indexing and semantic extraction for each search process crawl
Step 3: Return results
Each search process will return a list of web URLs and the associated indexed
matches, each with a rank (Google's PageRank is one example of such a ranking)
Results will be spread across multiple pages to ensure all results are returned to
the user.
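As a rough sketch of this step, the snippet below orders results by a relevance score, assigns each a rank, and splits them into pages. The `score` field, the `paginate` function, and the default page size of 10 are assumptions made for illustration; the text does not say how the rank is computed, and this is not Google's PageRank algorithm.

```python
def paginate(results, page_size=10):
    """Rank results by descending score and split them into pages.

    Each result is assumed to be a dict with a numeric "score" key.
    Returns a list of pages, where each page is a list of result dicts
    extended with a 1-based "rank".
    """
    ordered = sorted(results, key=lambda r: r["score"], reverse=True)
    ranked = [{**r, "rank": rank} for rank, r in enumerate(ordered, start=1)]
    # Slice the ranked list into consecutive pages so no result is dropped.
    return [ranked[i:i + page_size] for i in range(0, len(ranked), page_size)]
```

Because the slicing covers the whole ranked list, every result appears on exactly one page, which matches the requirement that all results are returned to the user.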
These steps form the basic flow of activities to implement. The flow is then
developed into a set of components: a front end, algorithms, data search, and metadata,
together with back-end neural networks, machine learning, and web crawling. This is the
technique we need to understand; the complexity in this process lies in the neural
networks, artificial intelligence, and machine learning processes. The extension of these
algorithms is where search engine companies develop and deliver their intellectual
property, and where they hold several patents.
Each user interaction with a search result can be further recorded as clickstream
data and decoded to show which results interested users the most. When collected
together, this clickstream data can be analyzed alongside the search terms and associated
metadata. The resulting data sets can be distributed across search applications and
leveraged across different user interfaces.
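One simple way to picture this analysis is to count clicks per search term and result URL. The sketch below assumes a hypothetical event format with `search_term` and `clicked_url` keys; real clickstream records carry more fields (timestamps, session identifiers, and so on), and the function names are illustrative only.

```python
from collections import Counter, defaultdict


def clicks_by_term(events):
    """Count how often each URL was clicked for each search term."""
    counts = defaultdict(Counter)
    for event in events:
        counts[event["search_term"]][event["clicked_url"]] += 1
    return counts


def most_clicked(counts, term):
    """Return the most-clicked URL for a term, or None if no clicks exist."""
    top = counts.get(term, Counter()).most_common(1)
    return top[0][0] if top else None
```

The per-term counts could then be joined with the metadata tags that produced each result, which is one way the clickstream could be tied back to search terms and metadata.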