Passionate engineer with 8+ years of experience in building robust, scalable and secure web applications.
I have a Master's in Computer Science from Indiana University, and my interests are in web development and computer security.
Big Data
Project Overview
The project was to understand the progression of international relations between the United States and the rest of the world over the years by performing sentiment analysis on The New York Times articles dataset. A sub objective was to verify the reliability of this sentiment by performing a sentiment analysis on various businesses and validating the generated sentiment against their stock performance over a given period of time.
Data sources: New York Times dataset retrieved from the New York Times Articles API. The retrieved articles were grouped by the geographical locations relevant to each article, and historical stock data was obtained from the Yahoo Finance API.
The project used MongoDB as the backend database. Hadoop was used to pick up news articles and stock data from the backend and process them in a distributed fashion, in conjunction with Stanford CoreNLP, which performed the sentiment analysis on the articles and business-related data. R was used for data visualization.
The project involved ingesting large volumes of Twitter data, designing data models, and evaluating query performance on the large datasets in MongoDB.
Data ingestion: This involved cleansing the raw data to make it suitable for storage with all necessary information
Data model design: This involved utilizing different models like reference, embedded or hybrid models to structure the data before storage
Query evaluation: This involved performing a set of standard queries against the two data models in MongoDB and analyzing the performance in retrieving the required results
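The difference between the reference and embedded models can be sketched in plain Python, without a running MongoDB instance. This is a minimal illustration only: the sample documents, field names, and helper functions below are hypothetical and are not the project's actual schema or queries.

```python
# Hypothetical sample data; field names are illustrative, not the project's schema.
users = [{"_id": 1, "name": "alice"}, {"_id": 2, "name": "bob"}]
tweets = [
    {"_id": 10, "user_id": 1, "text": "hello"},
    {"_id": 11, "user_id": 1, "text": "world"},
    {"_id": 12, "user_id": 2, "text": "hi"},
]


def to_embedded(users, tweets):
    """Embedded model: each user document carries its own tweets."""
    docs = {u["_id"]: {**u, "tweets": []} for u in users}
    for t in tweets:
        docs[t["user_id"]]["tweets"].append({"_id": t["_id"], "text": t["text"]})
    return list(docs.values())


def tweets_by_user_reference(users, tweets, name):
    """Reference model: two lookups (the user, then tweets by user_id)."""
    user = next(u for u in users if u["name"] == name)
    return [t["text"] for t in tweets if t["user_id"] == user["_id"]]


def tweets_by_user_embedded(embedded_users, name):
    """Embedded model: one lookup returns the user together with the tweets."""
    user = next(u for u in embedded_users if u["name"] == name)
    return [t["text"] for t in user["tweets"]]
```

Both functions return the same tweets for a given user; the models differ in how many lookups are needed and in how large each stored document becomes, which is the kind of trade-off a query evaluation would measure.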
Data sources: The Twitter dataset was downloaded from the University of Illinois in August 2014. It is a subset of Twitter data containing 284 million following relationships, 3 million user profiles, and 50 million tweets. The dataset was collected in May 2011.
The project used MongoDB as the backend database. The Mongoengine framework was used as the document-object mapper to work with MongoDB from Python.
Project Github link:
Programming languages: Python
Framework: Mongoengine
Database: MongoDB
Python Malware
Project Overview
Python malware is malware that infects other Python files by appending malicious code to them. It monitors and tracks user activities and affects system memory. The malware embeds malicious scripts in the affected Python files, makes registry entries on the user's system, and interferes with system functionality by disabling the task manager, disabling user access control, and executing the malware program on startup.
Front-end development for a home appliance reseller. The client wanted the existing website redesigned into a site that supports full-fledged e-commerce functionality.
Programming languages: HTML, CSS, JavaScript
Frameworks & Libraries: Bootstrap, jQuery
Project Github link:
Snapshot of the landing page
New Protocol to secure AODV in Mobile AdHoc Networks
Project Overview
This project developed a game-theoretic approach, called The New Protocol, and integrated it into the reactive Ad hoc On-demand Distance Vector (AODV) routing protocol to defend against blackhole attacks. The idea is based on non-cooperative game theory. AODV-NEW outperforms AODV in terms of the number of dropped packets when blackhole nodes exist within a MANET (Mobile AdHoc Network).
Programming languages: C/C++, TCL scripting
Tools & libraries: NS-2 Simulator
MANET (Mobile AdHoc Network) where blackhole nodes damage the routing function by dropping packets. M1 and M2 are the malicious nodes, S is the source and D is the destination
Tutor Me
Project Overview
Developed a peer tutor search and scheduling system that lets students search for, schedule, and rate peer tutors available in the university. The system suggests a list of tutors to a student based on the student's academic and personal requirements. The application maintains profiles of both peer tutors and students in order to map students to appropriate tutors. It also schedules sessions between tutors and students so that there are no issues with room allocation or conflicting appointments.
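The conflict check behind this kind of scheduling can be sketched as an interval-overlap test. This is a minimal sketch under stated assumptions: the tuple layout `(tutor, room, start, end)` and the function names are hypothetical and do not describe the application's actual data model.

```python
from datetime import datetime


def overlaps(start_a, end_a, start_b, end_b):
    """Two half-open intervals [start, end) overlap if each starts before the other ends."""
    return start_a < end_b and start_b < end_a


def has_conflict(existing, new):
    """Return True if `new` clashes with any booking that shares its tutor or room.

    Each booking is a hypothetical (tutor, room, start, end) tuple.
    """
    tutor, room, start, end = new
    for other_tutor, other_room, other_start, other_end in existing:
        shares_resource = tutor == other_tutor or room == other_room
        if shares_resource and overlaps(start, end, other_start, other_end):
            return True
    return False
```

Treating the intervals as half-open means a session ending at 11:00 does not clash with one starting at 11:00 in the same room.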
1) Combiner plugin for the word count program. The combiner reduces the amount of data transferred from the map function to the reduce function in a Hadoop program.
2) Analysis of RunnerMap.java in Hadoop-Blast and a code plugin that copies the distributed cache and the assigned FASTA file to local storage, then runs the BLAST binary with the correct parameters. Hadoop-Blast is an advanced Hadoop program that helps BLAST, a bioinformatics application, utilize the computing capability of Hadoop.
3) Analysis of PageRankMap.java and PageRankReduce.java, and a code plugin that takes the transformed adjacency matrix and calculates the page rank for all pages.
4) Utilization of HBase to load data directly for the word count program, instead of using the Hadoop Distributed File System.
5) HBase FreqIndexBuilder program to build an inverted index table that holds each unique term’s occurrences in all documents from the ClueWeb09 dataset. HBase FreqIndexBuilder is an advanced WordCount program that counts the number of occurrences of each word in a given text input dataset and also stores the related document name (identification number) as HBase inverted index records. These inverted indices are built to support efficient searches over a huge set of text data.
6) Analysis of KMeansMapTask.java and KMeansReduceTask.java. The project required plugging code into these two Java files in order to implement a parallel version of the K-means clustering application using the programming interfaces of the Twister MapReduce framework.
7) Utilization of the HBase inverted index built in project 5 and the page rank algorithm from project 3 to build a search engine.
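The effect of the combiner in item 1 can be illustrated with a small simulation. The original work was done in Java on Hadoop; the sketch below uses plain Python instead, and the function names and sample input are assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in the input split."""
    return [(word, 1) for word in line.split()]


def combine(pairs):
    """Combiner: pre-aggregate one mapper's output locally before the shuffle."""
    local = Counter()
    for word, count in pairs:
        local[word] += count
    return list(local.items())


def reduce_counts(all_pairs):
    """Reduce step: sum the counts for each word across all mappers."""
    totals = defaultdict(int)
    for word, count in all_pairs:
        totals[word] += count
    return dict(totals)


# Hypothetical input: two splits, each handled by one mapper.
splits = ["a b a", "b b c"]
without_combiner = [p for s in splits for p in map_words(s)]
with_combiner = [p for s in splits for p in combine(map_words(s))]
```

With this input the reducer receives 6 pairs without the combiner and 4 with it, while the final word counts are identical. This works because addition is associative and commutative, so partial sums can be computed on the map side.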
Programming language: Java
Frameworks: Apache Hadoop, MapReduce, Twister
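The frequency-based inverted index described in item 5 can be sketched in memory as follows. This is only an illustration in Python rather than the project's Java and HBase code: the dictionary layout, the function names, and the sample documents are assumptions, and the search here ranks by term frequency alone rather than combining it with page rank as in item 7.

```python
from collections import defaultdict


def build_freq_index(docs):
    """Build {term: {doc_id: occurrences}} from a {doc_id: text} mapping."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term][doc_id] = index[term].get(doc_id, 0) + 1
    return dict(index)


def search(index, term):
    """Return the ids of documents containing `term`, most frequent first."""
    postings = index.get(term.lower(), {})
    return sorted(postings, key=lambda doc_id: (-postings[doc_id], doc_id))


# Hypothetical documents keyed by identification number.
docs = {"d1": "hadoop hbase hadoop", "d2": "hbase"}
index = build_freq_index(docs)
```

Looking up a term is a single dictionary access regardless of how many documents exist, which is what makes an inverted index suitable for searching large text collections.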
Online course on Educational Assessment
Project Overview
1) Development of course participant dashboard to display student status and module completion
2) Notifications based on comments and other course activity
3) Configuration of announcements from instructors
4) Development of features to issue digital badges based on student performance
5) Enabling course participants to archive their course activity, so that future course participants can view their course modules
6) Integrated awarding of digital badges into the Open EdX Online Education platform using Mozilla’s BadgeKit API
7) Google Analytics, code optimizations, and defect resolution
* Future's advanced support details link: Click here
2) Development of a simple file system providing functionality such as create, open, close, read, and write. The UNIX file system was used as a reference, and this project provided a good understanding of the inode table structure, the other system tables used to handle files, and the buffers used to cache files and metadata.
* Xinu filesystem implementation details link: Click here
3) Development of a custom Unix shell that executes commands such as long listing of files and supports input and output redirection, pipes, and background processes.
4) Implementation of producer and consumer processes with synchronization.
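The producer-consumer synchronization in item 4 can be sketched with counting semaphores. This is a minimal sketch in Python threads, not the original operating-system implementation; the `BoundedBuffer` class and the `run` helper are hypothetical names introduced for illustration.

```python
import threading
from collections import deque


class BoundedBuffer:
    """Fixed-capacity queue guarded by two counting semaphores and a lock."""

    def __init__(self, capacity):
        self.items = deque()
        self.free_slots = threading.Semaphore(capacity)  # producer waits when full
        self.filled_slots = threading.Semaphore(0)       # consumer waits when empty
        self.lock = threading.Lock()                     # protects the deque itself

    def put(self, item):
        self.free_slots.acquire()
        with self.lock:
            self.items.append(item)
        self.filled_slots.release()

    def get(self):
        self.filled_slots.acquire()
        with self.lock:
            item = self.items.popleft()
        self.free_slots.release()
        return item


def run(n_items, capacity=2):
    """Run one producer and one consumer; return the items in consumption order."""
    buf = BoundedBuffer(capacity)
    consumed = []

    def producer():
        for i in range(n_items):
            buf.put(i)

    def consumer():
        for _ in range(n_items):
            consumed.append(buf.get())

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return consumed
```

With a single producer and a single consumer the buffer is first-in, first-out, so items come out in the order they were produced even when the capacity is smaller than the number of items.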