We describe a mathematical framework for the recently proposed Self-consistent Clustering Analysis (SCA) and Virtual Clustering Analysis (VCA), two methods designed for accurate and efficient numerical homogenization. In an offline stage, a few fine-scale simulations are performed on the representative volume element (RVE), and the fine cells are grouped into clusters by machine learning techniques. Then, in an online predictive stage, we transform the governing differential equation into an integral equation (the Lippmann-Schwinger equation) and solve it under the assumption that the response is uniform within each cluster. This substantially reduces the computational cost. Convergence in one space dimension is proved rigorously, and numerical simulations illustrate accurate homogenization results.
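The offline/online split can be illustrated by a minimal sketch for a one-dimensional, linear, periodic RVE of unit length. Everything specific here is an assumption made for illustration and is not taken from the methods themselves: the function names `offline_clusters` and `online_solve`, the equal-width binning used as a stand-in for the unspecified machine learning clustering, the arithmetic cluster-mean stiffness, and the reference stiffness `a0`. In this 1D setting the periodic Green operator acts on a polarization field tau as (tau - mean(tau)) / a0, which gives the cluster interaction matrix used below.

```python
import numpy as np


def offline_clusters(a, n_clusters):
    """Offline stage (sketch): group fine cells by strain concentration."""
    # In 1D the stress is constant, so under a unit macroscopic strain the
    # fine-scale strain in each cell is proportional to 1/a.
    conc = (1.0 / a) / np.mean(1.0 / a)
    span = conc.max() - conc.min()
    if span == 0.0:
        labels = np.zeros(a.size, dtype=int)
    else:
        # Equal-width binning: a simple, assumed stand-in for k-means.
        scaled = n_clusters * (conc - conc.min()) / span
        labels = np.minimum(scaled.astype(int), n_clusters - 1)
    # Relabel so that empty bins are dropped and labels are 0..K-1.
    return np.unique(labels, return_inverse=True)[1]


def online_solve(a, labels, eps_bar=1.0, a0=1.0):
    """Online stage (sketch): cluster-wise Lippmann-Schwinger system in 1D.

    Returns the homogenized stiffness sigma_bar / eps_bar.
    """
    k = int(labels.max()) + 1
    counts = np.bincount(labels, minlength=k)
    c = counts / a.size                                 # volume fractions
    a_cl = np.bincount(labels, weights=a, minlength=k) / counts  # mean stiffness
    # Interaction matrix of the 1D periodic Green operator:
    # D_IJ = (delta_IJ - c_J) / a0
    D = (np.eye(k) - c[None, :]) / a0
    # eps_I + sum_J D_IJ * (a_J - a0) * eps_J = eps_bar
    M = np.eye(k) + D * (a_cl - a0)[None, :]
    eps = np.linalg.solve(M, np.full(k, eps_bar))
    sigma_bar = np.sum(c * a_cl * eps)                  # volume-averaged stress
    return sigma_bar / eps_bar


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.uniform(1.0, 10.0, size=2000)               # fine-scale stiffness
    exact = 1.0 / np.mean(1.0 / a)                      # 1D harmonic mean
    for n in (1, 4, 16, 64):
        approx = online_solve(a, offline_clusters(a, n))
        print(n, approx, abs(approx - exact))
```

In this linear 1D sketch the clustered solution reduces to the harmonic mean of the cluster-averaged stiffnesses, weighted by volume fraction. It is therefore exact whenever the stiffness is uniform within each cluster, it does not depend on the choice of `a0`, and it always lies between the exact harmonic mean and the single-cluster arithmetic mean.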