Category "bigdata"

How to improve my tables and queries for Big Data applications?

I created an API in Symfony that writes more than 1 million entries per day into one of the MySQL tables. The table structure is defined this way: After s

Apache Spark - Is it possible to use a Dependency Injection Mechanism

Is it possible to use a framework to enable dependency injection in a Spark application? Is it possible to use Guice, for instance? If so,

Azure Data Explorer update record

I am new to Azure Data Explorer and I am wondering how to update a record in Azure Data Explorer using the Microsoft .NET SDK in C#. The Microsoft docum

How to find items in a collection which are not in another collection with MongoDB

I want to query my MongoDB database to find the documents that do not match between two collections. Here is my structure: CollectionA: _id, name, firstname, website_account_key, emai

Sklearn-GMM on large datasets

I have a large dataset (I can't fit all of the data in memory). I want to fit a GMM on this dataset. Can I use GMM.fit() (sklearn.mixture.GMM) repeatedly on min