Technologies Used In Developing Applications Using Apache Accumulo


I was recently talking about how people train themselves for Big Data projects. The technology stack is fairly daunting. Below are the technologies that I find helpful. I'll add to the list as I remember more:

Systems Administration Technologies

System administrators are absolutely essential to successful projects. They ensure that software is installed and configured correctly. More importantly, they ensure repeatable builds and deployments. Oh, and they really need to understand the widely varying failure modes. Read the excellently written Aphyr blog to learn about some of them.

OpenStack - The technology consists of a series of interrelated projects that control pools of processing, storage, and networking resources throughout a datacenter, all managed through a dashboard that gives administrators control while empowering its users to provision resources through a web interface.

Puppet - Puppet Open Source is a flexible, customizable framework, available under the Apache 2.0 license, designed to help system administrators automate the many repetitive tasks they regularly perform.

Bash - The command-line shell for Unix-based operating systems. While you learn the Bash shell, make sure you also become proficient in a scripting language such as Perl, Ruby, or Python.

Jenkins - An application that monitors executions of repeated jobs, such as building a software project or jobs run by cron.

Ganglia - (optional) A scalable distributed monitoring system for high-performance computing systems such as clusters and grids.

Gitorious - (optional) An infrastructure for hosting open source projects that use Git. It can be used inside your firewalls to provide secure access to Git repositories.

Jira - (optional) Issue tracker software. If you're using the Agile methodology, make sure to get the Jira Agile version.

Application Developer Technologies

Vim - While you may prefer a nice graphical IDE like NetBeans, Eclipse, or IntelliJ, you'll be totally lost if you don't understand Vim. It can easily open large files that utterly crush any IDE.

Java - The programming language for Accumulo. Actually, you can probably use most JVM-based languages.

Ant - While knowledge of Ant is not required, I use it to run both Java and MapReduce jobs. Its ability to orchestrate multiple targets can prove valuable.

Git - A distributed version control system designed to handle everything from small to very large projects with speed and efficiency. This is the version control system used by Accumulo. It seems fairly safe to say that you need at least a basic knowledge of Git to excel.

Maven - A software project management and comprehension tool. Based on the concept of a project object model (POM), Maven can manage a project's build, reporting, and documentation from a central piece of information. Apache Accumulo uses Maven to compile, test, and build jar files.

Tomcat or Jetty - (optional) Web applications are the main way to interact with users, and these two web servers are good to develop with.

Hadoop - The Hadoop project develops open-source software for reliable, scalable, distributed computing. There are several distributions of the Hadoop stack (MapR, Cloudera, etc.) that you can use.

ZooKeeper - A centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. All of these kinds of services are used in some form or another by distributed applications. Accumulo depends on ZooKeeper.

Accumulo - A sorted, distributed key/value store that provides robust, scalable, high-performance data storage and retrieval.

Jenkins - An application that monitors executions of repeated jobs, such as building a software project or jobs run by cron.

Solr - Includes powerful full-text search, hit highlighting, faceted search, near real-time indexing, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geo-spatial search. Solr is extremely useful when integrating between Big Data and applications. You can analyze the heck out of your data and then store the results in Solr.

Gitorious - (optional) An infrastructure for hosting open source projects that use Git. It can be used inside your firewalls to provide secure access to Git repositories.

Jira - (optional) Issue tracker software. If you're using the Agile methodology, make sure to get the Jira Agile version.

Gnuplot - (optional) plotting software

Graphviz - (optional) diagramming software
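
If "sorted key/value store" in the Accumulo entry sounds abstract, here is a rough, single-process sketch of why sorted keys matter: once keys are kept in order, reading a contiguous range is cheap. This is plain Java using TreeMap, not the Accumulo client API. The class name SortedStoreSketch, its put and scan methods, and the sample row IDs are all made up for illustration.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Toy in-memory stand-in for a sorted key/value store. NOT the Accumulo API. */
public class SortedStoreSketch {
    // TreeMap keeps String keys in lexicographic order.
    private final NavigableMap<String, String> entries = new TreeMap<>();

    /** Stores a value under a row ID (hypothetical method name). */
    public void put(String rowId, String value) {
        entries.put(rowId, value);
    }

    /** Returns entries with startRow <= key < endRow, in sorted key order. */
    public NavigableMap<String, String> scan(String startRow, String endRow) {
        return entries.subMap(startRow, true, endRow, false);
    }

    public static void main(String[] args) {
        SortedStoreSketch store = new SortedStoreSketch();
        // Insert out of order; the map sorts by key.
        store.put("user_0003", "carol");
        store.put("user_0001", "alice");
        store.put("event_0001", "login");
        store.put("user_0002", "bob");

        // Range scan: prints user_0001 and user_0002 only, in that order.
        for (Map.Entry<String, String> e : store.scan("user_0001", "user_0003").entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

A real distributed store spreads the sorted key space across many servers, but the idea of designing row IDs so that related records sit next to each other carries over.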



