Update classes.py in knn #103

Hi, I like your code. It's concise and efficient. But when I read the recommenders part, that is, "class UserBasedRecommender(UserRecommender)", I found that the code in the method named estimated_preference cannot guarantee that each neighbor's preference is multiplied by that neighbor's own similarity rather than someone else's.

This is the previous code:

    prefs = prefs[~np.isnan(prefs)]
    similarities = similarities[~np.isnan(prefs)]
    prefs_sim = np.sum(prefs[~np.isnan(similarities)] * similarities[~np.isnan(similarities)])
    total_similarity = np.sum(similarities)

Here is a simple example that follows the same steps as the code:

    >>> import numpy as np
    >>> p = np.array([np.nan, 3,4,5,np.nan,5,6,np.nan,9,10])
    >>> p
    array([ nan, 3., 4., 5., nan, 5., 6., nan, 9., 10.])
    >>> s = np.array([1,np.nan,4,6,np.nan,6,7,8,9,10])
    >>> s
    array([ 1., nan, 4., 6., nan, 6., 7., 8., 9., 10.])
    >>> p = p[~np.isnan(p)]
    >>> p
    array([ 3., 4., 5., 5., 6., 9., 10.])
    >>> s = s[~np.isnan(p)]
    >>> s
    array([ 1., nan, 4., 6., nan, 6., 7.])
    >>> p[~np.isnan(s)]
    array([ 3., 5., 5., 9., 10.])
    >>> s[~np.isnan(s)]
    array([ 1., 4., 6., 6., 7.])
    >>> p[~np.isnan(s)]*s[~np.isnan(s)]
    array([ 3., 20., 30., 54., 70.])

As you can see, it gets a wrong result. For instance, the first product, 3, is the second neighbor's preference (3) multiplied by the first neighbor's similarity (1). Once prefs has been filtered, its positions no longer line up with those of similarities.
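To make the misalignment concrete, here is a minimal sketch that tracks which original neighbor positions the previous code ends up pairing. The names `pref_idx`, `sim_idx`, `keep` and `paired` are made up for this illustration. It assumes, as the session above shows, that the shorter mask keeps only the first seven similarities; newer NumPy versions, I believe, raise an IndexError for a boolean mask of the wrong length instead, so that step is emulated explicitly.

```python
import numpy as np

prefs = np.array([np.nan, 3, 4, 5, np.nan, 5, 6, np.nan, 9, 10])
sims = np.array([1, np.nan, 4, 6, np.nan, 6, 7, 8, 9, 10])

# Original positions of the preferences that survive the first filter.
pref_idx = np.flatnonzero(~np.isnan(prefs))

# Assumption (taken from the session output): the filtered similarities are
# just the first len(pref_idx) entries, so their original positions are 0..6.
sim_idx = np.arange(len(pref_idx))

# Second filter: drop positions where the truncated similarity is NaN.
keep = ~np.isnan(sims[sim_idx])

# (preference position, similarity position) pairs that get multiplied.
paired = list(zip(pref_idx[keep].tolist(), sim_idx[keep].tolist()))
print(paired)

# The products reproduce the result from the session.
print(prefs[pref_idx[keep]] * sims[sim_idx[keep]])
```

In every pair the two positions differ, so no preference is multiplied by its own neighbor's similarity.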
My code is like this:

    temp_prefs = [~np.isnan(prefs)]
    temp_similarities = [~np.isnan(similarities)]
    noNaN_indices = np.logical_and(temp_prefs, temp_similarities)
    prefs_sim = np.sum(prefs[noNaN_indices[0] == True] * similarities[noNaN_indices[0] == True])
    similarities = similarities[~np.isnan(similarities)]
    total_similarity = np.sum(similarities)

With the same example:

    >>> pp = np.array([np.nan,3,4,5,np.nan,5,6,np.nan,9,10])
    >>> pp
    array([ nan, 3., 4., 5., nan, 5., 6., nan, 9., 10.])
    >>> ss = np.array([1,np.nan,4,6,np.nan,6,7,8,9,10])
    >>> ss
    array([ 1., nan, 4., 6., nan, 6., 7., 8., 9., 10.])
    >>> tss = [~np.isnan(ss)]
    >>> tss
    [array([ True, False, True, True, False, True, True, True, True, True], dtype=bool)]
    >>> tpp = [~np.isnan(pp)]
    >>> tpp
    [array([False, True, True, True, False, True, True, False, True, True], dtype=bool)]
    >>> nonNaN = np.logical_and(tss,tpp)
    >>> nonNaN
    array([[False, False, True, True, False, True, True, False, True, True]], dtype=bool)
    >>> ss[nonNaN[0] == True] * pp[nonNaN[0] == True]
    array([ 16., 30., 30., 42., 81., 100.])

As you can see, it gets the right answer: each product now pairs a preference with the similarity at the same position. If I have misunderstood, please let me know.

Thank you in advance.

Best wishes
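The same idea can be written a little more compactly with one combined boolean mask. This is only a sketch: `weighted_preference` is a hypothetical helper, not a function in the library. It also makes one assumption that differs from the code in this pull request: the total similarity is summed over the same neighbors that contribute to the numerator, whereas the code above sums every non-NaN similarity.

```python
import numpy as np


def weighted_preference(prefs, similarities):
    """Hypothetical helper: similarity-weighted mean preference.

    Only neighbors with both a preference and a similarity are used.
    """
    prefs = np.asarray(prefs, dtype=float)
    similarities = np.asarray(similarities, dtype=float)

    # One mask, built from the unfiltered arrays, so positions stay aligned.
    valid = ~np.isnan(prefs) & ~np.isnan(similarities)

    prefs_sim = np.sum(prefs[valid] * similarities[valid])
    # Assumption: normalize over the same neighbors as the numerator.
    total_similarity = np.sum(similarities[valid])

    if total_similarity == 0.0:
        return np.nan  # no usable neighbors
    return prefs_sim / total_similarity


pp = np.array([np.nan, 3, 4, 5, np.nan, 5, 6, np.nan, 9, 10])
ss = np.array([1, np.nan, 4, 6, np.nan, 6, 7, 8, 9, 10])
print(weighted_preference(pp, ss))  # 299 / 42, roughly 7.12
```

Building the mask before any filtering is the important part; whether to normalize over all similarities or only the paired ones is a separate design choice.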

Commits on Jan 3, 2015
