How to Get the Best out of Python and RapidMiner
In this blog post you will learn how to integrate Python machine learning algorithms into RapidMiner and make them look native. Having an open system like this, it is of course possible to extend the system...
How to Accelerate your Data Analysis Using Local MapReduce
Today I would like to share a unique method to parallelize RapidMiner processes with you. You can design arbitrary workflows which then run either on your laptop or your server using the paradigm...
This Is Data Science
In this post I will help you to get a quick overview of data science. The full complexity of algorithms can be reduced to six easily understandable groups. With this knowledge you are fully...
This Is The Most Important Technique in Data Science
Predictive analytics is all about finding general rules. The key thing in doing that is to be sure that those general rules are rules and that they are general. Validation is all about proving that....
Why You Should Ignore Data Science Performance Measures
Last week I was visiting a customer presenting my analysis. I was showing an AUC of 0.92. Impressive, uhm? You don't know what I am talking about? Great - you don't need to. There are a bunch of data...
This Simple Question Can Save You Millions of Dollars in Data Science
Data Science projects can be difficult to understand, and it is even harder to judge if the analysis was done carefully or simply quick and dirty - and thus not reliable. One simple way to get a quick...
This is How RapidMiner Professionals Get Maximum Productivity
RapidMiner is a very versatile tool and offers different options to customize it to your needs. I have seen quite a few people with different setups. So in order to get some interesting ideas for me and...
This is How You Analyse your LinkedIn Connections
Have you ever wondered what share of your connections are female or male? In this short guide you will learn how to figure this out. Get the Data: First we go to LinkedIn's export page and...
The Art of Data Science
In one of my last posts I focused on what data science is at its core. I got a lot of positive feedback on the simple six box approach. One comment made me think a bit more about the question: What...
How To Implement a New Learner for RapidMiner
I was eager to develop an operator for RapidMiner for a long time. I already implemented an operator once - but it was rather useless. After RapidMiner released the new how to develop extensions guide...
Quick Tip: How To Get Mails from GMail in RapidMiner
I recently tried to get emails from Gmail into RapidMiner. There are two operators in RM able to read via pop3/imap/jdni: Read Documents (Mail) and Process Documents from Mail Store. If you try to use...
Why it is All About Being Productive
Quite often I get into discussions with coders about the superiority of code-based solutions (usually Python, sometimes R) over a visual solution for data analytics. I am personally convinced that it...
Quick Tip: Get Real Random Seeds in RapidMiner
Usually you want to have reproducibility in a program. From time to time, though, there are rare cases where you do not want to have a deterministic random number. In my case this happens if I...
7 Things You Need To Do After Installing RapidMiner 7
You probably all know - RapidMiner 7 is out. I would like to take this chance to give you a list of things you need to do. Read The Docs: This might be one of the boring-sounding things; I cannot...
Value Generation in Data Analysis
Most companies are in the process of embracing data as a tactical and often strategic part of their business. Being part of the industry and part of the data science community I asked myself if it is...
RapidMiner Quick Tip: Generate Weight
I just wanted to add another quick tip to my blog. About half a year ago I found - with some help - the operator Generate Weight (Stratification). This is a pretty useful operator in case of imbalanced...
Where is the German Industry?
I asked myself recently: Where is the German industry located? My personal feeling is that a lot of German industry is located in the south, in Bavaria or Baden-Württemberg. But nevertheless I wanted to check this...
Take Care of Your Weights
Some time ago I wrote an article on how to generate example weights in RapidMiner. Example weights make single lines in your data set more important than others. This technique is used in two...
Building A Data Cleaner in RapidMiner
I've come across Randal Olson's data cleaner in Python. This tool is designed to: datacleaner is not magic, and it won't take an unorganized blob of text and automagically parse it out for you. What...
Ensure Class Mapping in RapidMiner
Just a quick one for all the RapidMiner users. In RapidMiner, nominal types are internally mapped to integers. In most cases you do not care about this mapping because it does not bother you. On one...