GraphChi  0.1
als_vertices_inmem.cpp File Reference
#include <string>
#include <algorithm>
#include "graphchi_basic_includes.hpp"
#include "als.hpp"

Classes

struct  ALSVerticesInMemProgram

Typedefs

typedef latentvec_t VertexDataType
typedef float EdgeDataType

Functions

int main (int argc, const char **argv)

Detailed Description

Author:
Aapo Kyrola akyrola@cs.cmu.edu
Version:
1.0

LICENSE

Copyright [2012] [Aapo Kyrola, Guy Blelloch, Carlos Guestrin / Carnegie Mellon University]

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

DESCRIPTION

Matrix factorization with the Alternating Least Squares (ALS) algorithm. This code is based on GraphLab's implementation of ALS by Joey Gonzalez and Danny Bickson (CMU). A good explanation of the algorithm is given in the following paper: "Large-Scale Parallel Collaborative Filtering for the Netflix Prize" by Yunhong Zhou, Dennis Wilkinson, Robert Schreiber and Rong Pan, http://www.springerlink.com/content/j1076u0h14586183/
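As a rough illustration of what one ALS half-step computes, the sketch below updates the latent factor of a single user from the factors of the movies that user rated, by solving the regularized normal equations. It is not taken from als_vertices_inmem.cpp: the names K, Vec, Mat, solve and als_update are hypothetical, the latent dimension of 2 is chosen only to keep the example small, and a hand-written Gaussian elimination stands in for the Eigen solver so that the example has no dependencies. The regularization term is scaled by the number of ratings, as in the weighted-lambda scheme of the cited paper.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

constexpr std::size_t K = 2;  // hypothetical latent dimension, kept tiny for the example
using Vec = std::array<double, K>;
using Mat = std::array<Vec, K>;

// Solve A x = b by Gaussian elimination with partial pivoting.
// (Stand-in for a linear algebra library; A and b are taken by value.)
Vec solve(Mat A, Vec b) {
    for (std::size_t col = 0; col < K; ++col) {
        std::size_t piv = col;
        for (std::size_t r = col + 1; r < K; ++r)
            if (std::fabs(A[r][col]) > std::fabs(A[piv][col])) piv = r;
        assert(std::fabs(A[piv][col]) > 1e-12 && "singular system");
        std::swap(A[col], A[piv]);
        std::swap(b[col], b[piv]);
        for (std::size_t r = col + 1; r < K; ++r) {
            const double f = A[r][col] / A[col][col];
            for (std::size_t c = col; c < K; ++c) A[r][c] -= f * A[col][c];
            b[r] -= f * b[col];
        }
    }
    Vec x{};
    for (std::size_t i = K; i-- > 0;) {  // back substitution
        double s = b[i];
        for (std::size_t c = i + 1; c < K; ++c) s -= A[i][c] * x[c];
        x[i] = s / A[i][i];
    }
    return x;
}

// One ALS half-step for a single vertex: with the neighbors' factors held
// fixed, solve (sum_j m_j m_j^T + lambda * n * I) u = sum_j r_j m_j,
// where n is the number of ratings.
Vec als_update(const std::vector<Vec>& neighbors,
               const std::vector<double>& ratings, double lambda) {
    assert(neighbors.size() == ratings.size());
    Mat XtX{};
    Vec Xty{};
    for (std::size_t e = 0; e < neighbors.size(); ++e) {
        for (std::size_t i = 0; i < K; ++i) {
            Xty[i] += neighbors[e][i] * ratings[e];
            for (std::size_t j = 0; j < K; ++j)
                XtX[i][j] += neighbors[e][i] * neighbors[e][j];
        }
    }
    for (std::size_t i = 0; i < K; ++i)
        XtX[i][i] += lambda * static_cast<double>(neighbors.size());
    return solve(XtX, Xty);
}
```

The same update applies to movies with the roles swapped, which is where the "alternating" in the name comes from: user factors are solved with movie factors fixed, then movie factors with user factors fixed.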

This is a faster version of ALS that stores the latent factors of vertices in memory and therefore requires more memory. See the version "als_edgefactors" for a low-memory implementation.

In the code, we use movie-rating terminology for clarity. This code has been tested with the Netflix movie rating challenge data, where the task is to predict how a user rates movies on a scale from 1 to 5.
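In movie-rating terms, a predicted rating in a latent factor model is the dot product of a user factor and a movie factor. The sketch below shows that computation together with the squared error against an observed rating. The names kLatent, Factor, predict_rating and squared_error are hypothetical, and clamping the prediction to the 1 to 5 range is an assumption made for this example, not something confirmed by the description above.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kLatent = 3;  // hypothetical latent dimension
using Factor = std::array<double, kLatent>;

// Predicted rating: dot product of the two latent factors, clamped to the
// valid rating range (the clamping is an assumption of this sketch).
double predict_rating(const Factor& user, const Factor& movie,
                      double min_rating = 1.0, double max_rating = 5.0) {
    assert(min_rating <= max_rating);
    double dot = 0.0;
    for (std::size_t i = 0; i < kLatent; ++i) dot += user[i] * movie[i];
    return std::min(max_rating, std::max(min_rating, dot));
}

// Squared error of the prediction against an observed rating; summing this
// over all ratings and taking the root of the mean gives the RMSE.
double squared_error(const Factor& user, const Factor& movie, double observed) {
    const double diff = observed - predict_rating(user, movie);
    return diff * diff;
}
```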

This code has integrated preprocessing ('sharding'), so it is not necessary to run the sharder before running the matrix factorization algorithm. Input data must be provided in the Matrix Market format (http://math.nist.gov/MatrixMarket/formats.html).
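To make the expected input concrete, here is a minimal sketch of a reader for the Matrix Market coordinate format: a "%%MatrixMarket" banner line, optional comment lines starting with '%', a size line "rows columns nonzeros", and then one "row column value" entry per line with 1-based indices. The types Rating and RatingMatrix and the function read_matrix_market are hypothetical and are not GraphChi's own parser. The sketch assumes real-valued entries, with rows interpreted as users and columns as movies.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

struct Rating {
    std::size_t user;   // 0-based row index
    std::size_t movie;  // 0-based column index
    double value;
};

struct RatingMatrix {
    std::size_t num_users = 0;
    std::size_t num_movies = 0;
    std::vector<Rating> entries;
};

RatingMatrix read_matrix_market(std::istream& in) {
    std::string line;
    // The first line must be the banner, e.g.
    // "%%MatrixMarket matrix coordinate real general".
    if (!std::getline(in, line) || line.compare(0, 14, "%%MatrixMarket") != 0)
        throw std::runtime_error("missing %%MatrixMarket banner");

    // Skip comment and blank lines; the first other line is the size line.
    while (std::getline(in, line) && (line.empty() || line[0] == '%')) {}

    RatingMatrix m;
    std::size_t nnz = 0;
    std::istringstream size_line(line);
    if (!(size_line >> m.num_users >> m.num_movies >> nnz))
        throw std::runtime_error("bad size line");

    m.entries.reserve(nnz);
    std::size_t i = 0, j = 0;
    double v = 0.0;
    while (m.entries.size() < nnz && (in >> i >> j >> v)) {
        if (i < 1 || i > m.num_users || j < 1 || j > m.num_movies)
            throw std::runtime_error("index out of range");
        m.entries.push_back({i - 1, j - 1, v});  // convert to 0-based indices
    }
    if (m.entries.size() != nnz)
        throw std::runtime_error("fewer entries than declared");
    assert(m.entries.size() == nnz);  // postcondition: exactly nnz entries read
    return m;
}
```

A small input accepted by this sketch would consist of the banner line, the size line "3 2 2", and the two entries "1 1 5" and "3 2 1.5".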

ALS uses the free linear algebra library 'Eigen'. See Readme_Eigen.txt for instructions on how to obtain it.

At the end of processing, the two latent factor matrices are written to files in the Matrix Market format.
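For readers unfamiliar with how a dense matrix looks in Matrix Market form, the sketch below writes a factor matrix in the Matrix Market array format: a banner line, a "rows columns" line, and then one value per line in column-major order. Whether this program writes the array variant or another Matrix Market variant is not stated above, so treat the choice as an assumption. The function names write_factors_mm and factors_to_mm_string are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Write a dense factor matrix (one row per vertex, one column per latent
// dimension) in Matrix Market array format. Values are emitted column by
// column, as the array format requires.
void write_factors_mm(std::ostream& out,
                      const std::vector<std::vector<double>>& factors) {
    const std::size_t rows = factors.size();
    const std::size_t cols = rows == 0 ? 0 : factors[0].size();
    for (const auto& row : factors) {
        assert(row.size() == cols);  // all rows must have the same length
        (void)row;                   // silence unused warning when NDEBUG is set
    }
    out << "%%MatrixMarket matrix array real general\n";
    out << rows << ' ' << cols << '\n';
    for (std::size_t c = 0; c < cols; ++c)
        for (std::size_t r = 0; r < rows; ++r)
            out << factors[r][c] << '\n';
}

// Convenience wrapper returning the serialized matrix as a string.
std::string factors_to_mm_string(const std::vector<std::vector<double>>& factors) {
    std::ostringstream out;
    write_factors_mm(out, factors);
    return out.str();
}
```

Passing a std::ofstream to write_factors_mm would write the same text to a file, once for the user factors and once for the movie factors.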

USAGE

bin/example_apps/matrix_factorization/als_vertices_inmem file <matrix-market-input> niters 5


Typedef Documentation

Type definitions. Remember to create suitable graph shards using the sharder program.