Apache Spark: Python vs Scala

Apache Spark is one of the most popular frameworks in big data analysis. Spark is written in Scala and has APIs for Scala, Python, Java, and R, so we can work with any of these, but the most commonly used languages are Scala and Python. Java does not support Read-Evaluate-Print-Loop, and R […]


Hive: A Warehousing Tool

Hive is a data warehouse infrastructure tool used for processing structured data in Hadoop. Primarily used to summarize and manage big data, Hive makes querying and analysis easier. Hive data warehouse software facilitates querying and managing large datasets residing in distributed storage. Hive is a powerful tool for ETL. It is, however, relatively slow compared with traditional […]


Let’s Understand Data Lake, Data Warehouse and Database

“Data lakes, data warehouses, and databases” are all terms used in data management. But what exactly do they mean, and are they the same or different from each other? Let’s explore that in this article. We will start with the definitions, then discuss the key differences. A database is generic data storage and […]


What is a Data Lake?

Data lakes are becoming increasingly important as people, especially in business and technology, want to perform broad data exploration and discovery. Bringing all, or most, of the data together in a single place can be useful for that. A data lake is a place to store your structured and unstructured data, as well […]


How to Build Big Data Analytics Infrastructure

Ref: https://www.datasciencecentral.com/profiles/blogs/big-data-analytics-infrastructure Big data can bring huge benefits to businesses of all sizes. However, as with any business project, proper preparation and planning are essential, especially when it comes to infrastructure. Until recently it was hard for companies to get into big data without making heavy infrastructure investments (expensive data warehouses, software, analytics staff, etc.). But […]


Everything about Kotlin

Kotlin is a general-purpose, open-source, statically typed “pragmatic” programming language for the JVM and Android that combines object-oriented and functional programming features. It is focused on interoperability, safety, clarity, and tooling support. Kotlin originated at JetBrains, the company behind IntelliJ IDEA, in 2010, and has been open source since 2012. Why use Kotlin for […]


10 Essential Books for Deep Learning

Deep learning is a significant part of the broader subject of machine learning. Still relatively new, its popularity is constantly growing, so it makes sense that people would want to read and learn more about the subject. If only there were a comprehensive list of such resources, organized in one place, […]


Understanding Fast Data and its Importance in an IoT-driven world

The Internet of Things and now the Industrial Internet of Things are both making a great impact on the world, so many people are analyzing the effect the two will have on the near future. How are things changing because of these technologies, how are they affecting daily life […]


11 Most Significant Tips for Learning Python Programming

Stack Overflow data indicates the increasing use of Python — possibly encouraged by its data science friendliness — has driven it to new levels of popularity, making it the “fastest-growing major programming language.” That conclusion comes from the popular coding Q&A site’s practice of drawing on years of data — collected from users seeking help […]
