But this would require an infinite number of parameters, as the input may be continuous.
Given an output $y_i$, we assume all input features are conditionally independent of each other, i.e.
$P(X|y_i) = \prod_j P(x_j|y_i)$
By modelling each $P(x_j|y_i)$ as a Gaussian, we need a mean and a variance per feature, i.e. $2n$ parameters for every class $y_i$. In the case of a binary output, there are only $2n + 2n + 1$ parameters in total, which reduces the complexity of the problem:
- $P(x_j|y_1)$: $2n$ parameters
- $P(x_j|y_2)$: $2n$ parameters
- $P(y_1) = \theta,\ P(y_2) = 1-\theta$: $1$ parameter
This is called the Naive Bayes algorithm.
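As a concrete illustration, here is a minimal sketch of Gaussian Naive Bayes in Python (the function names are hypothetical, not from any particular library): it estimates the $2n$ per-class parameters plus the prior $\theta$, and classifies by maximising $P(y_i)\prod_j P(x_j|y_i)$ in log space.

```python
import numpy as np

def fit_gaussian_nb(X, y):
    """Estimate the 2n parameters (one mean, one variance per feature)
    for each class, plus the class prior theta.
    X: (m, n) array; y: (m,) array with labels in {0, 1}."""
    params = {}
    for c in (0, 1):
        Xc = X[y == c]
        params[c] = {
            "mu": Xc.mean(axis=0),         # n means
            "var": Xc.var(axis=0) + 1e-9,  # n variances (small floor for stability)
            "prior": Xc.shape[0] / X.shape[0],
        }
    return params

def log_gaussian(x, mu, var):
    # log N(x; mu, var), evaluated per feature
    return -0.5 * (np.log(2 * np.pi * var) + (x - mu) ** 2 / var)

def predict(params, x):
    """Pick argmax_c P(y_c) * prod_j P(x_j | y_c), computed in log space."""
    scores = {
        c: np.log(p["prior"]) + log_gaussian(x, p["mu"], p["var"]).sum()
        for c, p in params.items()
    }
    return max(scores, key=scores.get)
```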
Since we are modelling the joint probability distribution, once we know it we can generate samples from it. Hence this model is also known as a generative model.
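Reusing the hypothetical `params` dictionary from the sketch above, generation is just ancestral sampling: draw $y$ from the prior, then each feature independently from its class-conditional Gaussian.

```python
import numpy as np

def generate(params, rng=None):
    """Sample (x, y) from the learned joint P(X, y): y from the prior,
    then each x_j independently from N(mu_j, var_j) of the chosen class."""
    rng = rng or np.random.default_rng()
    y = 0 if rng.random() < params[0]["prior"] else 1
    p = params[y]
    x = rng.normal(p["mu"], np.sqrt(p["var"]))  # one draw per feature
    return x, y
```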
We can also model the entire problem as an $n$-dimensional Gaussian, without the [#Conditional Independence Assumption].
For this, we need an $n$-dimensional mean vector $\vec{\mu}$ and a covariance matrix, which has $\mathcal{O}(n^2)$ entries.
So here we have $\mathcal{O}(n^2)$ parameters in total. Hence, the [#Conditional Independence Assumption] makes the parameter space linear in $n$ rather than quadratic.
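To make the gap concrete, a quick count per class (assuming the covariance matrix is symmetric, so it has $n(n+1)/2$ free entries):

```python
def param_counts(n):
    """Per-class parameter counts: full Gaussian vs. Naive Bayes."""
    full = n + n * (n + 1) // 2  # mean vector + symmetric covariance: O(n^2)
    naive = 2 * n                # one mean and one variance per feature: O(n)
    return full, naive

for n in (10, 100, 1000):
    print(n, param_counts(n))  # e.g. n=1000: 501500 vs. 2000
```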
Basically, constraining the Hypothesis space reduces the complexity of the ML problem.
We can simplify the model even further by considering
$P(y_1|X) = \frac{1}{1+\exp\left(\beta_0 + \sum_j \beta_j x_j\right)}$
This model, known as logistic regression, has only $n+1$ parameters.
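A minimal sketch of fitting these $n+1$ parameters by gradient descent on the negative log-likelihood, keeping the notes' convention $P(y_1|X) = 1/(1+\exp(\beta_0 + \sum_j \beta_j x_j))$ (the more common $\sigma(\beta^\top x)$ form is the same model with $\beta \to -\beta$); this is illustrative, not a reference implementation:

```python
import numpy as np

def p_y1(Xb, beta):
    """P(y_1 | x) in the notes' convention: 1 / (1 + exp(beta . x))."""
    return 1.0 / (1.0 + np.exp(Xb @ beta))

def fit_logistic(X, y, lr=0.1, steps=2000):
    """Fit the n+1 parameters (beta_0 plus one beta_j per feature) by
    gradient descent on the negative log-likelihood.
    X: (m, n) array; y: (m,) array with y=1 encoding class y_1."""
    m, n = X.shape
    Xb = np.hstack([np.ones((m, 1)), X])  # column of 1s carries beta_0
    beta = np.zeros(n + 1)
    for _ in range(steps):
        p = p_y1(Xb, beta)
        # with the 1/(1+e^{+z}) convention the NLL gradient is X^T (y - p);
        # under the usual sigmoid(z) convention it would be X^T (p - y)
        beta -= lr * Xb.T @ (y - p) / m
    return beta
```

Classification then amounts to checking which side of the boundary a point falls on: predict $y_1$ when $P(y_1|X) \geq \tfrac{1}{2}$.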
$P(y_1|X) = P(y_2|X) = \tfrac{1}{2} \implies \exp\left(\beta_0 + \sum_j \beta_j x_j\right) = 1 \implies \beta_0 + \sum_j \beta_j x_j = 0$
That is, our decision boundary is a hyperplane. If the data cannot be separated by a hyperplane, then we cannot use this assumption.
A constraint defined on our hypothesis set is known as a Language Bias, and it is our choice which Language Bias to pick.