
Posts

Showing posts from July, 2018

Predicting Cost of Tender with 99.24% Accuracy : Miracle!

Data Science is reaching new levels and so are the models. But reaching a whopping 99.24% accuracy using simple feature engineering and a simple Decision Tree Classifier? That's new! Hello everyone, today I am going to present my model, which can predict the value range of a tender in the Seattle Trade Permits dataset with a whopping accuracy of 99.24% (with some obvious caveats, which I will discuss at the end). My kernel: Yet Another Value Prediction. BASIC EDA: This time out, I am going to use the plotly library in Python. It is easily the best option for interactive plots, and if you actually visit the kernel, you will understand why. First of all, we will focus on the Top Grossing Contractors in the Seattle area, those who have earned the most from tender acquisitions, which leads to an interactive graph (a sketch of the idea follows below). Similarly, one could plot another graph for the amount earned per project. But another thing which caught…
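For a flavour of the plotting described above, here is a minimal sketch of such an interactive plotly bar chart. The file name and the column names ('contractorcompanyname', 'value') are assumptions for illustration; the exact ones live in the kernel itself.

```python
# Minimal sketch: top grossing contractors as an interactive plotly bar chart.
# File name and column names are hypothetical placeholders.
import pandas as pd
import plotly.graph_objs as go
from plotly.offline import init_notebook_mode, iplot

init_notebook_mode(connected=True)  # enable offline plotting in a notebook

df = pd.read_csv('seattle_trade_permits.csv')  # hypothetical path

# Sum tender value per contractor and keep the ten largest earners.
top = (df.groupby('contractorcompanyname')['value']
         .sum()
         .nlargest(10))

fig = go.Figure(
    data=[go.Bar(x=top.index, y=top.values)],
    layout=go.Layout(title='Top Grossing Contractors in Seattle',
                     yaxis=dict(title='Total tender value')),
)
iplot(fig)  # renders an interactive chart inline in the kernel
```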

Data Science Libraries to look out for in 2018

Hey Readers, Python has gained a lot of traction in the Data Science industry in recent years, so I wanted to outline some of its most useful libraries for data scientists and engineers, based on recent experience. NumPy: When starting any scientific task in Python, one inevitably turns to Python's SciPy Stack, a collection of software specifically designed for scientific computing in Python (not to be confused with the SciPy library, which is one piece of this stack, nor with the community around the stack). So that is where we need to begin. The stack is quite large, with more than a dozen libraries in it, so we will focus on the core packages (particularly the most fundamental ones). The most fundamental package, around which the scientific computation stack is built, is NumPy (which stands for Numerical Python). It provides an abundance of useful features for operations…
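As a quick taste of why NumPy anchors the stack, a tiny sketch of its n-dimensional arrays and vectorised arithmetic:

```python
# NumPy in a nutshell: n-dimensional arrays with vectorised operations.
import numpy as np

a = np.arange(12).reshape(3, 4)  # 3x4 array holding 0..11
print(a.mean(axis=0))            # per-column means, no explicit loops
print(a * 2 + 1)                 # element-wise arithmetic over the whole array
```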

Datasets by Microsoft Research now available in the cloud : Microsoft announces Open Datasets!

Hey Readers, today I bring exciting news for all you aspiring data scientists and machine learners! Something new happened on the Microsoft Research Blog: The Microsoft Research Outreach team has worked extensively with the external research community to enable adoption of cloud-based research infrastructure over the past few years. Through this process, we experienced the ubiquity of Jim Gray’s fourth paradigm of discovery based on data-intensive science – that is, almost all research projects have a data component to them. This data deluge also demonstrated a clear need for curated and meaningful datasets in the research community, not only in computer science but also in interdisciplinary and domain sciences. Today we are excited to launch Microsoft Research Open Data – a new data repository in the cloud dedicated to facilitating collaboration across the global research community. Microsoft Research Open Data, in a single, convenient, cloud-hosted location, offers…

Data Science Tip : Why and How to Improve Your Training Data

Hi readers, there are plenty of good reasons why researchers are so focused on model architectures, but it means that there are very few resources available to guide people who are focused on deploying machine learning in production. To address that, a recent conference talk covered "the unreasonable effectiveness of training data", and I want to expand on that a bit in this blog post, explaining why data is so important, along with some practical tips on improving it. As part of my work I collaborate closely with a lot of researchers and product teams, and my belief in the power of data improvements comes from the massive gains I've seen them achieve when they concentrate on that side of their model building. The biggest barrier to using deep learning in many applications is getting sufficiently high accuracy in the real world, and improving the training set is the fastest route I've seen to accuracy improvements…

Weekly Open Source News -1

Hello everyone! Welcome to the first edition of Open Source Weekly, where I bring you the new and updated GitHub repositories which we use now and then. Pandas (Python): The @pandas-dev/pandas team has recently started a new contribution milestone for non-organisation members named "Contributions Welcome", which is described as: "Changes that would be nice to have in the next release. These issues are not blocking. They will be pushed to the next release if no one has time to fix them." As of now, it is about 20% complete, so do look out for new issues here and there if you are interested in making some sexy DataFrame stuff. Atom (JavaScript): The @atom/atom team is facing a really interesting issue right now, namely "Atom does not quit on OS X" (#17672). Interesting, no? Apparently both the keyboard shortcut and the menu bar exit are not working for this user, so in case anyone is interested in helping out, they surely can. The link for the same is…

Predicting App Popularity using its Size : Bad Idea

Hey Folks! I recently came across an excellent, hot, trending dataset on Kaggle. It is named Mobile App Statistics (mainly based upon the Apple App Store), and I instantly decided to try my hand at this dataset. Link to the dataset: Apple App Store (7200 apps). Now, our objectives for the developed kernel would be these three: first, checking out popular applications in each genre in the App Store (basically, grabbing the top charts for our data analysis); second, checking the trend of an app's user experience with respect to its cost, user rating count, and the number of devices and languages it supports; third, judging a game's popularity by its app size and building a Random Forest Classifier to classify by popularity. Of these three, I will be discussing the first and the third objectives here! Before anything else, one may notice that size in bytes is not really a good standard of measure, so why not convert it into megabytes and apply the conversion across the whole column? A sketch follows below.
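A minimal sketch of that conversion, assuming the data is loaded into a pandas DataFrame with a size_bytes column; treat the file name and column names as assumptions, since the exact ones live in the kernel.

```python
# Minimal sketch: convert the app size column from bytes to megabytes.
# File name and column names are hypothetical placeholders.
import pandas as pd

df = pd.read_csv('AppleStore.csv')  # hypothetical path

def bytes_to_mb(size_bytes):
    # 1 MB = 1024 * 1024 bytes; round for readable plots
    return round(size_bytes / (1024 * 1024), 2)

df['size_mb'] = df['size_bytes'].apply(bytes_to_mb)
```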

The difference between Artificial Intelligence and Machine Learning

Artificial Intelligence and Machine / Deep Learning: these are the buzzwords being repeatedly used in a promising time where man no longer needs to do everything on his own. He is getting some rather intelligent help now at the hands of our own machines. From virtual assistants to website builders, we can see the advent of intelligent automation everywhere around us. But, in the meantime, you need to know where to use what. I grant you that AI and Machine Learning are so close in practical applications that differentiating between them is really tough, and the words can often be used interchangeably without noticeable dispute. Despite that, I'd suggest you pick your words carefully, so let me explain the basic difference between Artificial Intelligence (AI), Machine Learning (ML) and Deep Learning (DL). Artificial Intelligence: First coined in 1956 by John McCarthy, AI involves machines that can perform tasks that are characteristic of…

About This Blog

Hello readers, this is your captain on this ship of Artificial Intelligence, hop aboard lads! Allow me to introduce myself. I am Uddeshya Singh, a sophomore at HBTU Kanpur in the field of Computer Science and Engineering. But ... that's just education, what am I apart from that? A coder, a YouTuber, an Open Source contributor at pandas, numpy, and cosmos, a DSA Intern at OpenGenus Organisation, and of course an Artificial Intelligence enthusiast. Obviously, I have a GitHub account which you can follow me on, and you can definitely find me on Facebook and whatnot. But that's not the point of this introduction. I want to let you guys know how this blog is going to work. There will largely be only three categories: coding challenges, tutorial supplements, and interesting articles regarding new AI advancements which I come across (or maybe Open Source developments regarding the few modules which I follow pretty minutely :) ). You can always reach me…
