Tuesday: New Home Sales

By | ai, bigdata, machinelearning


From Matthew Graham at Mortgage News Daily: Low and Sideways, Mortgage Rates Play Waiting Game

Mortgage rates were slightly higher for the 3rd straight day, continuing a modest bounce back from the year’s lowest rates last Wednesday. …

While the general movement in rates has been slightly higher, it hasn’t lifted rates much above 2017’s lows. Especially when considered next to anything before last Wednesday, recent rate offerings have been low and the trend has been sideways. Most lenders continue to offer conventional 30yr fixed rates of 4.0% on top tier scenarios. The only difference from Friday would be marginally higher upfront costs, but several lenders are effectively “unchanged.”
emphasis added

Tuesday:
• At 10:00 AM ET, New Home Sales for April from the Census Bureau. The consensus is for a decrease in sales to 604 thousand Seasonally Adjusted Annual Rate (SAAR) in April from 621 thousand in March.

• Also at 10:00 AM, Richmond Fed Survey of Manufacturing Activity for May.


Source link

7 Things to Know About Performance Marketing

By | ai, bigdata, machinelearning

You’ve probably heard a lot about performance marketing, but what really sets it apart from any other kind of marketing? We spoke with Nir Elharar, author of this great post about getting started with performance marketing, about what puts the performance in performance marketing.

1. It’s based on optimizing for down-funnel KPIs

With other kinds of marketing, like display advertising or awareness marketing, your KPIs (Key Performance Indicators) might be things like impressions or clicks, but performance marketers are focused on actions: leads, sign-ups, conversions, and sales.

2. You only pay for what you get

Instead of paying up front, as you would for a display ad, in performance marketing you pay per desirable action. If a campaign isn’t succeeding, it isn’t costing you as much, which is why performance marketers are always optimizing, tinkering, and trying new things.

3. The cadence of optimization is much faster

Because performance marketing is based entirely on measurable results, the optimization happens much faster. Performance marketers don’t have three, four, or five weeks to get it right. They have to home in on the right audience quickly. It’s much more hands-on, with a constant feedback loop of data: pouring more resources into the strategies that are working and pulling them quickly from strategies that aren’t.

4. It gives you real-time measurement of ROI

With other kinds of advertising and marketing, measuring results can be difficult. How do you know if you’ve raised someone’s brand awareness? How do you measure good feelings about your brand? There are ways, of course, but they usually happen after the fact and aren’t necessarily hard data. With performance marketing, you can see a snapshot of your ROI at any given instant: the cost per lead, per sign-up, per sale.

5. The risks are lower for advertisers

Since you only spend for what you get, you don’t have to sell anybody in your business on taking a risk and putting forward a big chunk of a budget for an ad. That also means that campaigns can be launched more quickly, because there aren’t laborious approval processes. As Elharar says, “In most traditional forms of advertising, the advertiser pays a fee up front for ad space independent of performance. That could mean hundreds to thousands of dollars spent without ever seeing a conversion. With performance marketing, advertisers only pay for successful transactions.”

6. It works with small budgets and large ones

Getting started with performance marketing doesn’t take a huge budget, which makes it more accessible to smaller organizations or brands just starting to figure out their marketing strategy. Seeing results is a great way to convince stakeholders to get on board, and with performance marketing, you see results as you invest your marketing budget. It’s also easy to adjust: ramping up a campaign can be simple and quick.

7. It’s an innovative space

Performance marketers are always looking for new ways to reach “the right audience at the right time,” as Elharar says. In today’s quickly-changing landscape, that often means performance marketers are the first ones trying and optimizing marketing for new social platforms, mobile-first applications, chatbots, and more.

The post 7 Things to Know About Performance Marketing appeared first on outbrain.com.


Source link

Quantile definitions in SAS

By | ai, bigdata, machinelearning

(This article was originally published at The DO Loop, and syndicated at StatsBlogs.)

In last week’s article about the Flint water crisis, I computed the 90th percentile of a small data set. Although I didn’t mention it, the value that I reported is different from the 90th percentile that is reported in Significance magazine.

That is not unusual. The data only had 20 unique values, and there are many different formulas that you can use to compute sample percentiles (generally called quantiles). Because different software packages use different default formulas for sample quantiles, it is not uncommon for researchers to report different quantiles for small data sets. This article discusses the five percentile definitions that are supported in SAS software.

You might wonder why there are multiple definitions. Recall that a sample quantile is an estimate of a population quantile. Statisticians have proposed many quantile estimators, some of which are based on the empirical cumulative distribution function (ECDF) of the sample, which approximates the cumulative distribution function (CDF) of the population. The ECDF is a step function that has a jump discontinuity at each unique data value. Consequently, the inverse ECDF does not exist and the quantiles are not uniquely defined.

Definitions of sample quantiles

In SAS, you can use the PCTLDEF= option in PROC UNIVARIATE or the QNTLDEF= option in other procedures to control the method used to estimate quantiles. A sample quantile does not have to be an observed data value because you are trying to estimate an unknown population parameter.

For convenience, assume that the sample data are listed in sorted order. In high school, you probably learned that if a sorted sample has an even number of observations, then the median is the average of the two middle observations. The default quantile definition in SAS (QNTLDEF=5) extends this familiar rule to other quantiles. Specifically, if the sample size is N and you ask for the q_th quantile, then when Nq is an integer the quantile is the average of the data values x[Nq] and x[Nq+1]. When Nq is not an integer, the quantile is the next ordered value, x[j+1], where j = floor(Nq). For example, if N=10 and you want the q=0.17 quantile, then Nq=1.7, so j=1 and the 17th percentile is reported as the second ordered value, x[2].

Taking the next ordered value is not the only choice you can make when Nq is not an integer, and averaging is not the only choice when it is. The other percentile definitions correspond to different choices. For example, you could round Nq to the nearest integer (QNTLDEF=2), or you could use the empirical distribution function directly, which takes x[Nq] at integer values of Nq instead of averaging (QNTLDEF=3). Or you could use linear interpolation between adjacent data values (QNTLDEF=1 and QNTLDEF=4); these two definitions differ in whether the interpolation position is Nq or (N+1)q. In the example where N=10 and q=0.17, the QNTLDEF=1 interpolated quantile is 0.3 x[1] + 0.7 x[2].
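To make these rules concrete, here is a minimal R sketch (R rather than SAS, since the arithmetic is language-independent) that applies the QNTLDEF=5 and QNTLDEF=1 rules described above. The helper names qntl5 and qntl1 are made up for this illustration; this is not SAS’s internal code.

qntl5 <- function(x, q) {
  # QNTLDEF=5 rule: average x[Nq] and x[Nq+1] when Nq is an integer, otherwise take x[floor(Nq)+1]
  x <- sort(x)
  Nq <- length(x) * q
  j <- floor(Nq)
  if (Nq == j) (x[j] + x[j + 1]) / 2 else x[j + 1]
}

qntl1 <- function(x, q) {
  # QNTLDEF=1 rule: linear interpolation between x[j] and x[j+1], where j = floor(Nq)
  x <- sort(x)
  Nq <- length(x) * q
  j <- floor(Nq)
  g <- Nq - j
  (1 - g) * x[j] + g * x[j + 1]
}

x <- c(0, 1, 1, 1, 2, 2, 2, 4, 5, 8)   # the example data used later in this post
qntl5(x, 0.17)   # x[2] = 1
qntl1(x, 0.17)   # 0.3*x[1] + 0.7*x[2] = 0.7

(For simplicity, the sketch ignores edge cases such as q < 1/N or q = 1.)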

Visualizing the definitions for quantiles

The SAS documentation contains the formulas used for the five percentile definitions, but sometimes a visual comparison is easier than slogging through mathematical equations.
The differences between the definitions are most apparent on small data sets that contain integer values, so let’s create a tiny data set and apply the five definitions to it.
The following example has 10 observations and six unique values.

data Q;
input x @@;
datalines;
0 1 1 1 2 2 2 4 5 8
;

ECDF of a small data set

You can use PROC UNIVARIATE or other methods to plot the empirical cumulative proportions, as shown. Because the ECDF is a step function, most cumulative proportion values (such as 0.45) are “in a gap.” By this I mean that there is no observation t in the data for which the cumulative proportion P(X ≤ t) equals 0.45. Depending on how you define the sample quantiles, the 0.45 quantile might be reported as 1, 1.5, 1.95, or 2.
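If you do not have SAS at hand, you can reproduce the same spread of answers with base R’s quantile() function. The mapping used below (R types 4, 6, 3, 1, and 2 standing in for QNTLDEF=1, 4, 2, 3, and 5, respectively) is my assumption based on the Hyndman and Fan (1996) definitions cited at the end of this post, so treat it as a rough analogue rather than an exact equivalence.

x <- c(0, 1, 1, 1, 2, 2, 2, 4, 5, 8)
# R types assumed to correspond to QNTLDEF = 1, 4, 2, 3, 5, in that order
sapply(c(4, 6, 3, 1, 2), function(type) quantile(x, probs = 0.45, type = type))
# 1.50  1.95  1.00  2.00  2.00   (the four distinct answers: 1, 1.5, 1.95, 2)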

Since the default definition is QNTLDEF=5, let’s visualize the sample quantiles for that definition. You can use the PCTLPTS= option on the OUTPUT statement in PROC UNIVARIATE to declare the percentiles that you want to compute. Equivalently, you can use the QNTL function in PROC IML, as below. Regardless, you can ask SAS to find the quantiles for a set of probabilities on a fine grid of points such as {0.001, 0.002, …, 0.998, 0.999}. You can then graph the probabilities versus the quantiles to visualize how the percentile definition computes quantiles for the sample data.

proc iml;
use Q; read all var "x"; close;       /* read data */
prob = T(1:999) / 1000;               /* fine grid of prob values */
call qntl(quantile, x, prob, 5);      /* use method=5 definition  */
create Pctls var {"quantile" "prob" "x"}; append; close;
quit;
 
title "Sample Percentiles";
title2 "QNTLDEF = 5";
proc sgplot data=Pctls noautolegend;
   scatter x=quantile y=prob / markerattrs=(size=5 symbol=CircleFilled);
   fringe x / lineattrs=GraphData2(thickness=3);
   xaxis display=(nolabel) values=(0 to 8);
   yaxis offsetmin=0.05 grid values=(0 to 1 by 0.1) label="Cumulative Proportions";
   refline 0 1 / axis=y;
run;

Sample quantiles (percentiles) for a small data set

For each probability value (Y axis), the graph shows the corresponding sample quantile (X axis) for the default definition in SAS, which is QNTLDEF=5. The X axis also displays red tick marks at the location of the data. You can use this graph to find any quantile. For example, to find the 0.45 quantile, you start at 0.45 on the Y axis, move to the right until you hit a blue marker, and then drop down to the X axis to discover that the 0.45 quantile estimate is 2.

If you prefer to think of the quantiles (the X values) as a function of the probabilities, just interchange the X= and Y= arguments in the SCATTER statement (or turn your head sideways!). Then the quantile function is a step function.

Comparing all five SAS percentile definitions

It is easy to put a loop around the SAS/IML computation to compute the sample quantiles for the five different definitions that are supported in SAS. The following SAS/IML program writes a data set that contains the sample quantiles. You can use the WHERE statement in PROC PRINT to compare the same quantile across the different definitions. For example, the following displays the 0.45 quantile (45th percentile) for the five definitions:

/* Compare all SAS methods */
proc iml;
use Q; read all var "x"; close;       /* read data */
prob = T(1:999) / 1000;               /* fine grid of prob values */
create Pctls var {"Qntldef" "quantile" "prob" "x"};
do def = 1 to 5;
   call qntl(quantile, x, prob, def); /* qntldef=1,2,3,4,5 */
   Qntldef = j(nrow(prob), 1, def);   /* ID variable */
   append;
end;
close;
quit;
 
proc print data=Pctls noobs;
   where prob = 0.45;                 /* compare 0.45 quantile for different definitions */
   var Qntldef quantile;
run;

You can see that the different definitions lead to different sample quantiles. How do the quantile functions compare? Let’s plot them and see:

ods graphics / antialiasmax=10000;
title "Sample Percentiles in SAS";
proc sgpanel data=Pctls noautolegend;
   panelby Qntldef / onepanel rows=2;
   scatter x=quantile y=prob/ markerattrs=(size=3 symbol=CircleFilled);
   fringe x;
   rowaxis offsetmax=0 offsetmin=0.05 grid values=(0 to 1 by 0.1) label="Cumulative Proportion";
   refline 0 1 / axis=y;
   colaxis display=(nolabel);
run;

Compare percentile definitions in SAS

The graphs (click to enlarge) show that QNTLDEF=1 and QNTLDEF=4 are piecewise-linear interpolation methods, whereas QNTLDEF=2, 3, and 5 are discrete rounding methods. The default method (QNTLDEF=5) is similar to QNTLDEF=3 except for certain averaged values. For the discrete definitions, SAS returns either a data value or the average of adjacent data values. The interpolation methods do not have that property: they can return quantile values that lie anywhere between observed data values.

If you have a small data set, as in this blog post, it is easy to see how the percentile definitions are different. For larger data sets (say, 100 or more unique values), the five quantile functions look quite similar.

The differences between definitions are most apparent when there are large gaps between adjacent data values. For example, the sample data have a large gap between the ninth and tenth ordered observations, which have the values 5 and 8, respectively.
If you compute the 0.901 quantile, you will discover that the “round to nearest” method (QNTLDEF=2) gives 5 as the sample quantile, whereas the “round up” method (QNTLDEF=3) gives the value 8. Similarly, the “backward interpolation” method (QNTLDEF=1) gives 5.03, whereas the “forward interpolation” method (QNTLDEF=4) gives 7.733.
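The same rough R analogue shown earlier (with the same assumed type mapping, which is my approximation rather than a documented equivalence) reproduces these values:

x <- c(0, 1, 1, 1, 2, 2, 2, 4, 5, 8)
# assumed mapping: type 3 ~ QNTLDEF=2, type 1 ~ QNTLDEF=3, type 4 ~ QNTLDEF=1, type 6 ~ QNTLDEF=4
sapply(c(3, 1, 4, 6), function(type) quantile(x, probs = 0.901, type = type))
# approximately 5.000  8.000  5.030  7.733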

In summary, this article shows how the (somewhat obscure) QNTLDEF= option results in different quantile estimates. Most people just accept the default definition (QNTLDEF=5), but if you are looking for a method that interpolates between data values, rather than a method that rounds and averages, I recommend QNTLDEF=1, which performs linear interpolations of the ECDF. The differences between the definitions are most apparent for small samples and when there are large gaps between adjacent data values.

Reference

For more information about sample quantiles, including a mathematical discussion of the various formulas, see
Hyndman, R. J. and Fan, Y. (1996) “Sample quantiles in statistical packages”, American Statistician, 50, 361–365.

The post Quantile definitions in SAS appeared first on The DO Loop.

Please comment on the article here: The DO Loop





Source link

Preview of EARL San Francisco

By | ai, bigdata, machinelearning

The first ever EARL (Enterprise Applications of the R Language) conference in San Francisco will take place on June 5-7 (and it's not too late to register).  The EARL conference series is now in its fourth year, and the prior conferences in London and Boston have each been a fantastic way to learn how R is used in real-world applications. Judging from the speaker lineup in San Francisco, next month's event looks like it will feature some great R stories as well. Included on the program will be stories from the following companies:

  • AirBnB (Keynote presentation by Ricardo Bion)
  • StitchFix (Keynote presentation by Hilary Parker)
  • Uber (Keynote presentation by Prakhar Mehrota)
  • Hitachi Solutions (David Bishop)
  • Amgen (Daniel Saldana)
  • Pfizer (Luke Fostveldt)
  • Fred Hutch Cancer Research Center (Thomas Vaughan)
  • Domino Data Lab (Eduardo  Ariño de la Rubia)
  • Genentech (Gabriel Becker)

… and many more. On a personal note, I'm looking forward to presenting with my colleague Bharath Sankaranarayan on using Microsoft R to predict hospital stays.

For more information on EARL 2017 San Francisco or to register, follow the link below. (And if you can't make San Francisco, check out EARL 2017 London, coming September 12-14.)

EARL: San Francisco 2017

 


Source link

The Marcos Lopez de Prado Hierarchical Risk Parity Algorithm

By | ai, bigdata, machinelearning

(This article was first published on R – QuantStrat TradeR, and kindly contributed to R-bloggers)

This post will be about replicating the Marcos Lopez de Prado algorithm from his paper, “Building Diversified Portfolios that Outperform Out of Sample.” The algorithm attempts a tradeoff between the classic mean-variance optimization algorithm, which takes the covariance structure into account but is unstable, and an inverse-volatility algorithm, which ignores covariance but is more stable.

This is a paper that I struggled with until I ran the code in Python (I have Anaconda installed but have trouble installing some packages such as keras because I’m on Windows… would love to have someone walk me through setting up a Linux dual-boot), because I assumed that the clustering algorithm was actually able to concretely group every asset into a particular cluster (i.e., ETF 1 would be in cluster 1, ETF 2 in cluster 3, etc.). It turns out that isn’t the case at all.

Here’s how the algorithm actually works.

First off, it computes a covariance and correlation matrix (created from simulated data in Marcos’s paper). Next, it uses a hierarchical clustering algorithm on a distance-transformed correlation matrix, with the “single” linkage method (i.e., friend of friends; run ?hclust in R to read up more on this). The key output here is the order of the assets from the clustering algorithm. Note well: this ordering is the only relevant artifact of the entire clustering step.

Using this order, it then uses an algorithm that does the following:

Initialize a vector of weights equal to 1 for each asset.

Then, run the following recursive algorithm:

1) Break the order vector into two lists of equal length (or as close to equal as possible).

2) For each half of the list, compute the inverse-variance weights using just the diagonal of the covariance-matrix slice containing the assets of interest, and then compute the variance of that cluster under those weights (i.e., w' V w, where V is the covariance slice and w the inverse-variance weights).

3) Then, do a basic inverse-variance weighting between the two clusters. Call the weight of cluster 0 alpha = 1 - cluster_variance_0/(cluster_variance_0 + cluster_variance_1), and the weight of cluster 1 its complement, 1 - alpha (see the toy example after this list).

4) Multiply the weights of all assets in cluster 0 by the weight of cluster 0, and the weights of all assets in cluster 1 by the weight of cluster 1. That is, weights[index_assets_cluster_0] *= alpha and weights[index_assets_cluster_1] *= 1 - alpha.

5) Lastly, for each half that contains more than one asset, repeat this entire process on that half, until every asset has effectively become its own cluster.
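As a toy illustration of steps 3 and 4 (the numbers here are made up for illustration, not taken from the paper): suppose the two halves of the current list have cluster variances of 0.04 and 0.01.

cVar0 <- 0.04                          # hypothetical cluster variances, purely illustrative
cVar1 <- 0.01
alpha <- 1 - cVar0 / (cVar0 + cVar1)   # 0.2
c(alpha, 1 - alpha)                    # the noisier half gets 0.2 of the parent weight, the quieter half 0.8

Every asset inside each half then has its running weight multiplied by that half's factor, and the recursion continues within each half.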

Here is the implementation in R code.

First off, here are the covariance and correlation matrices used in this code, obtained from Marcos Lopez de Prado’s code in the appendix of his paper.

> covMat
             V1           V2           V3           V4           V5          V6           V7           V8           V9          V10
1   1.000647799 -0.003050479  0.010033224 -0.010759689 -0.005036503 0.008762563  0.998201625 -0.001393196 -0.001254522 -0.009365991
2  -0.003050479  1.009021349  0.008613817  0.007334478 -0.009492688 0.013031817 -0.009420720 -0.015346223  1.010520047  1.013334849
3   0.010033224  0.008613817  1.000739363 -0.000637885  0.001783293 1.001574768  0.006385368  0.001922316  0.012902050  0.007997935
4  -0.010759689  0.007334478 -0.000637885  1.011854725  0.005759976 0.000905812 -0.011912269  0.000461894  0.012572661  0.009621670
5  -0.005036503 -0.009492688  0.001783293  0.005759976  1.005835878 0.005606343 -0.009643250  1.008567427 -0.006183035 -0.007942770
6   0.008762563  0.013031817  1.001574768  0.000905812  0.005606343 1.064309825  0.004413960  0.005780148  0.017185396  0.011601336
7   0.998201625 -0.009420720  0.006385368 -0.011912269 -0.009643250 0.004413960  1.058172027 -0.006755374 -0.008099181 -0.016240271
8  -0.001393196 -0.015346223  0.001922316  0.000461894  1.008567427 0.005780148 -0.006755374  1.074833155 -0.011903469 -0.013738378
9  -0.001254522  1.010520047  0.012902050  0.012572661 -0.006183035 0.017185396 -0.008099181 -0.011903469  1.075346677  1.015220126
10 -0.009365991  1.013334849  0.007997935  0.009621670 -0.007942770 0.011601336 -0.016240271 -0.013738378  1.015220126  1.078586686
> corMat
             V1           V2           V3           V4           V5          V6           V7           V8           V9          V10
1   1.000000000 -0.003035829  0.010026270 -0.010693011 -0.005020245 0.008490954  0.970062043 -0.001343386 -0.001209382 -0.009015412
2  -0.003035829  1.000000000  0.008572055  0.007258718 -0.009422702 0.012575370 -0.009117080 -0.014736040  0.970108941  0.971348946
3   0.010026270  0.008572055  1.000000000 -0.000633903  0.001777455 0.970485047  0.006205079  0.001853505  0.012437239  0.007698212
4  -0.010693011  0.007258718 -0.000633903  1.000000000  0.005709500 0.000872861 -0.011512172  0.000442908  0.012052964  0.009210090
5  -0.005020245 -0.009422702  0.001777455  0.005709500  1.000000000 0.005418538 -0.009347204  0.969998023 -0.005945165 -0.007625721
6   0.008490954  0.012575370  0.970485047  0.000872861  0.005418538 1.000000000  0.004159261  0.005404237  0.016063910  0.010827955
7   0.970062043 -0.009117080  0.006205079 -0.011512172 -0.009347204 0.004159261  1.000000000 -0.006334331 -0.007592568 -0.015201540
8  -0.001343386 -0.014736040  0.001853505  0.000442908  0.969998023 0.005404237 -0.006334331  1.000000000 -0.011072068 -0.012759610
9  -0.001209382  0.970108941  0.012437239  0.012052964 -0.005945165 0.016063910 -0.007592568 -0.011072068  1.000000000  0.942667300
10 -0.009015412  0.971348946  0.007698212  0.009210090 -0.007625721 0.010827955 -0.015201540 -0.012759610  0.942667300  1.000000000

Now, for the implementation.

This reads in the two matrices above and gets the clustering order.

covMat <- read.csv('cov.csv', header = FALSE)
corMat <- read.csv('corMat.csv', header = FALSE)

clustOrder <- hclust(dist(corMat), method = 'single')$order

This is the clustering order:

> clustOrder
 [1]  9  2 10  1  7  3  6  4  5  8

Next come the getIVP (get Inverse Variance Portfolio) and getClusterVar functions (note: I’m trying to keep the naming conventions identical to those in Dr. Lopez de Prado’s paper):

getIVP <- function(covMat) {
  invDiag <- 1/diag(as.matrix(covMat))
  weights <- invDiag/sum(invDiag)
  return(weights)
}

getClusterVar <- function(covMat, cItems) {
  covMatSlice <- covMat[cItems, cItems]
  weights <- getIVP(covMatSlice)
  cVar <- t(weights) %*% as.matrix(covMatSlice) %*% weights
  return(cVar)
}

Next, my code diverges from the code in the paper, because I do not use the list comprehension structure, but instead opt for a recursive algorithm, as I find that style to be more readable.

One wrinkle to note is the use of the double-arrow assignment operator, <<-, to assign to a variable outside the scope of the recurFun function. I assign the initial weight vector w in the global environment and update it from within the recurFun function. I am aware that it is a faux pas to create variables in the global environment, but my attempts at creating a temporary environment in which to update the weight vector did not produce the updating mechanism I had hoped for, so a little bit of assistance with refactoring this code would be appreciated.

getRecBipart <- function(covMat, sortIx) {
  # keeping track of w in the global environment
  assign("w", value = rep(1, ncol(covMat)), envir = .GlobalEnv)
  recurFun(covMat, sortIx)
  return(w)
}

recurFun <- function(covMat, sortIx) {
  subIdx <- 1:trunc(length(sortIx)/2)
  cItems0 <- sortIx[subIdx]
  cItems1 <- sortIx[-subIdx]
  cVar0 <- getClusterVar(covMat, cItems0)
  cVar1 <- getClusterVar(covMat, cItems1)
  alpha <- as.numeric(1 - cVar0/(cVar0 + cVar1))  # drop the 1x1 matrix from %*% to a scalar
  
  # scoping mechanics using w as a free parameter
  w[cItems0] <<- w[cItems0] * alpha
  w[cItems1] <<- w[cItems1] * (1 - alpha)
  
  if(length(cItems0) > 1) {
    recurFun(covMat, cItems0)
  }
  if(length(cItems1) > 1) {
    recurFun(covMat, cItems1)
  }
}
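As an aside on the scoping question raised above: one way to avoid writing w into the global environment is to keep both the weight vector and the recursion inside a single enclosing function, so that the <<- operator finds w in that function's frame rather than in .GlobalEnv. The following is only a sketch of one possible refactor (it reuses getClusterVar from above), not the code used for the results below.

getRecBipartLocal <- function(covMat, sortIx) {
  w <- rep(1, ncol(covMat))               # lives in this function's frame, not the global environment
  recurse <- function(cItems) {
    subIdx <- 1:trunc(length(cItems)/2)
    cItems0 <- cItems[subIdx]
    cItems1 <- cItems[-subIdx]
    cVar0 <- getClusterVar(covMat, cItems0)
    cVar1 <- getClusterVar(covMat, cItems1)
    alpha <- as.numeric(1 - cVar0/(cVar0 + cVar1))
    w[cItems0] <<- w[cItems0] * alpha     # <<- resolves to the w defined in getRecBipartLocal
    w[cItems1] <<- w[cItems1] * (1 - alpha)
    if (length(cItems0) > 1) recurse(cItems0)
    if (length(cItems1) > 1) recurse(cItems1)
  }
  recurse(sortIx)
  return(w)
}

Calling getRecBipartLocal(covMat, clustOrder) should produce the same weight vector as getRecBipart; the original getRecBipart is what is run below.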

Lastly, let’s run the function.

out <- getRecBipart(covMat, clustOrder)

With the result (which matches the paper):

> out
 [1] 0.06999366 0.07592151 0.10838948 0.19029104 0.09719887 0.10191545 0.06618868 0.09095933 0.07123881 0.12790318

So, hopefully this democratizes the use of this technology in R. While I have seen a raw Rcpp implementation and one from the Systematic Investor Toolbox, neither of those implementations satisfied me from a “plug and play” perspective. This implementation solves that issue. Anyone here can copy and paste these functions into their environment and immediately make use of one of the algorithms devised by one of the top minds in quantitative finance.

A demonstration in a backtest using this methodology will be forthcoming.

Thanks for reading.

NOTE: I am always interested in networking and full-time opportunities which may benefit from my skills. Furthermore, I am also interested in project work in the volatility ETF trading space. My linkedin profile can be found here.

To leave a comment for the author, please follow the link and comment on their blog: R – QuantStrat TradeR.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...




Source link

Mapping Data Sources to Objectives

By | ai, bigdata, machinelearning

As a visible project, The Analyst’s Canvas is new, but it’s been cooking for years. Now comes the fun part: working through the ways to use it. Today, let’s talk about raw material: the data that go into the analytic process that leads eventually to information, insight and action.

Look at the row of three boxes across the middle of the canvas: data / sources, analysis and presentation / delivery. In 2008, I used that basic outline to describe the building blocks of social media analysis. With the boxes empty, we have a framework to summarize many different tools and approaches. In the next release of the canvas, this row will be labeled Description.

The Analyst’s Canvas

The big idea of the canvas is to keep analytical work grounded with two of my favorite questions: what are you trying to accomplish, and why? Collectively, the boxes on the Description row characterize an intelligence or analytics process from source data to delivery, whether the finished product is a report, a software tool or something else. The row is below the Objective box as a reminder that the work has to support meaningful objectives.

I suspect that many discussions will take the components of an analytical play as a unit, especially since so many capabilities come packaged as turnkey tools with data, analytics and presentation built in. But whether building a new capability or evaluating an existing one, the three component boxes must be the right choices to support the Objective. It’s not enough that the pieces work together, because we run the risk of developing elegant solutions to the wrong problems.

Using the canvas as a prompt
Every box in the canvas includes a set of basic prompts to initiate the exploration. The Data exploration begins with these:

  • What data/information is required?
  • What sources will provide it?
  • What are the limitations and drawbacks of the chosen sources?
  • Is this the right source, or is it just familiar or available?

Beyond the prompts, we can use the canvas to ask important questions about our preferred data sources:

  • Does this source contain the information needed to support the Objective?
  • Does the information from this source answer the questions we need to address?
  • What other/additional sources might better answer the questions?

Analyzing the canvas
If you look back at the canvas, you’ll see that those questions address the relationship between one box (in this case, Data) and its neighbors (Objective, Questions and Alternatives). Its other neighbor, Analysis, is a special case. Depending on which you consider first, you might ask if the source contains the information needed for your analysis or the analysis is appropriate to the properties of the source.

Source to mission

In the first draft of the Explorer Guide, I included some suggested orders for working through the sections of the canvas for a few scenarios. In this exercise, I’m seeing something different: insights we can gain from the relationships across boundaries within the model. More to come.




Source link

Machine Learning For Dummies

By | iot, machinelearning

Your no-nonsense guide to making sense of machine learning

Machine learning can be a mind-boggling concept for the masses, but those who are in the trenches of computer programming know just how invaluable it is. Without machine learning, fraud detection, web search results, real-time ads on web pages, credit scoring, automation, and email spam filtering wouldn’t be possible, and this is only showcasing just a few of its capabilities. Written by two data science experts, Machine Learning For Dummies offers a much-needed entry point for anyone looking to use machine learning to accomplish practical tasks.

Covering the entry-level topics needed to get you familiar with the basic concepts of machine learning, this guide quickly helps you make sense of the programming languages and tools you need to turn machine learning-based tasks into a reality. Whether you’re maddened by the math behind machine learning, apprehensive about AI, perplexed by preprocessing data—or anything in between—this guide makes it easier to understand and implement machine learning seamlessly.

  • Grasp how day-to-day activities are powered by machine learning
  • Learn to ‘speak’ certain languages, such as Python and R, to teach machines to perform pattern-oriented tasks and data analysis
  • Learn to code in R using R Studio
  • Find out how to code in Python using Anaconda

Dive into this complete beginner’s guide so you are armed with all you need to know about machine learning!

$18.81