Hadoop and NoSQL

Utilize the power and scalability of open source software used by the big Internet companies

Big Data analytics

Build an efficient data analytics platform to discover trends and understand your data

Enterprise Collaboration Software

From Q&A services to collaborative tools for complex knowledge management

About Scaleia

  • The first IT consultancy in Geneva specializing in scalable software and big data analytics

  • Experience in successful, pioneering big data analytics and information management projects at CERN since 2006

  • We believe in finding the right, efficient solution to every IT challenge

  • Specialties:

    • Big Data (MapReduce, Hadoop, NoSQL)

    • Distributed Systems

    • Knowledge exchange in organizations (from Q&A forums to custom corporate learning)

    • Open Innovation

    • Search (Lucene, Solr)

    • Algorithmics, System design & performance optimization

  • Our solutions are based on standard open source software, which prevents "vendor lock-in" and allows massive cost reductions

  • Founded by Jan Iwaszkiewicz

  • Our clients include: Faveeo

  • We are open to cooperation with:

    • Software providers

    • Consultancies

    • Service Integrators

    • Experts, Developers, Students

Using our experience and open source software, we can create a custom, scalable proof-of-concept infrastructure in a matter of days.

Solutions

Open Source solutions for scalable data architectures

Hadoop

The most popular big data storage and processing framework.
The Apache Hadoop® project develops open-source software for reliable, scalable, distributed computing.
The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high availability, the library itself is designed to detect and handle failures at the application layer, thereby delivering a highly available service on top of a cluster of computers, each of which may be prone to failures.
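As a rough illustration of the "simple programming models" mentioned above, the following Python sketch imitates the map, shuffle and reduce phases of a word count inside a single process. It does not use Hadoop itself; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are assumptions made for this example only.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all emitted values by key (done by the framework on a real cluster)."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Combine all values collected for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce one after another in a single process."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, standing in for files stored on a cluster.
    sample = ["big data needs big clusters", "data moves to compute"]
    print(word_count(sample))
```

On a real cluster the map and reduce functions run on many machines in parallel, while the framework takes care of the grouping step and of re-running failed tasks.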

Solr

Solr is the popular, blazing-fast open source enterprise search platform from the Apache Lucene™ project. Its major features include powerful full-text search, hit highlighting, faceted search, near real-time indexing, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search. Solr is highly reliable, scalable and fault tolerant, providing distributed indexing, replication and load-balanced querying, automated failover and recovery, centralized configuration and more. Solr powers the search and navigation features of many of the world's largest internet sites.
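To make the idea of faceted search concrete, here is a minimal Python sketch that filters a small in-memory document list by a query term and counts the matches per value of one field. It is not the Solr API; the function `search_with_facets`, the fields `text` and `format`, and the sample documents are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

Document = Dict[str, str]


def search_with_facets(
    docs: List[Document], query: str, facet_field: str
) -> Tuple[List[Document], Counter]:
    """Return documents whose text contains the query word, plus the number
    of matching documents for each value of facet_field."""
    term = query.lower()
    # Naive whole-word match; a real search engine would use an inverted index.
    hits = [doc for doc in docs if term in doc["text"].lower().split()]
    # Count facet values among the matches only, skipping documents without the field.
    facets = Counter(doc[facet_field] for doc in hits if facet_field in doc)
    return hits, facets


if __name__ == "__main__":
    # Hypothetical sample documents.
    documents = [
        {"text": "Hadoop cluster setup guide", "format": "pdf"},
        {"text": "Scaling a Hadoop cluster", "format": "word"},
        {"text": "Solr search tuning", "format": "pdf"},
        {"text": "Hadoop and Solr together", "format": "pdf"},
    ]
    hits, facets = search_with_facets(documents, "hadoop", "format")
    print(len(hits), dict(facets))
```

The facet counts are what a search interface typically shows next to the results, so that users can narrow them down by category.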

Mahout

The goal of Apache Mahout is to build machine learning libraries that scale to reasonably large data sets. Mahout's core algorithms for clustering, classification and batch-based collaborative filtering are implemented on top of Apache Hadoop using the map/reduce paradigm.

Cassandra

The Apache Cassandra database is the right choice when you need scalability and high availability without compromising performance. Linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure make it the perfect platform for mission-critical data. Cassandra's support for replicating across multiple datacenters is best-in-class, providing lower latency for your users and the peace of mind of knowing that you can survive regional outages.

CloudSigma

CloudSigma is an innovative Infrastructure-as-a-Service (IaaS) provider based in Zurich, Switzerland. It offers high-availability, flexible cloud servers and cloud hosting in both Europe and the US.

Contact Us
