Key-switching matrices.
There are two basic approaches to key-switching: either decompose the mod-q ciphertext into bits (or digits) to make it low-norm, or perform the key-switching operation mod Q>>q. The tradeoff is that when decomposing the (coefficients of the) ciphertext into t digits, we need to increase the size of the key-switching matrix by a factor of t (and the running time grows similarly). On the other hand, if we do not decompose at all then we need to work modulo Q>q^2, which means that the bitsize of our largest modulus q0 more than doubles (and hence the parameter m also more than doubles). In general, if we decompose into digits of size B then we need to work with Q>q*B.
The part of the spectrum where we expect to find the sweet spot is when we decompose the ciphertext into digits of size B=q0^{1/t} for some small constant t (maybe t=2,3 or so). This means that our largest modulus has to be Q>q0^{1+1/t}, which also increases the parameter m by a factor of (1+1/t). It also means that for key-switching in the top levels we would break the ciphertext into t digits, hence the key-switching matrix will have t columns.
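The size tradeoff above can be tabulated with a few lines of arithmetic. This is only a rough sketch: the function name key_switch_costs and the 300-bit value for q0 are made up for illustration, and the calculation ignores the extra slack that the strict inequality and the noise terms require.

```python
import math

def key_switch_costs(q0_bits: int, t: int) -> dict:
    """Approximate sizes when splitting into t digits of size B = q0^(1/t)."""
    digit_bits = math.ceil(q0_bits / t)      # bitsize of one digit, log2(B)
    big_q_bits = q0_bits + digit_bits        # Q > q0 * B, so add the bitsizes
    return {
        "columns": t,                        # the matrix has one column per digit
        "digit_bits": digit_bits,
        "Q_bits": big_q_bits,
        "growth": big_q_bits / q0_bits,      # roughly 1 + 1/t
    }

# Hypothetical 300-bit q0, with t = 1 (no decomposition), 2 and 3.
costs = [key_switch_costs(300, t) for t in (1, 2, 3)]
for c in costs:
    print(c)
```

With no decomposition (t=1) the modulus doubles in bitsize, while t=2 and t=3 grow it by about 1.5 and 1.33 respectively, at the price of 2 or 3 columns.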
A key-switch matrix W[s'->s] converts a ciphertext-part with respect to secret-key polynomial s' into a canonical ciphertext (i.e. a two-part ciphertext with respect to (1,s)). The matrix W is a 2-by-t matrix of DoubleCRT objects. The bottom row consists of (pseudo)random elements. Then for column j, if the bottom element is aj then the top element is set as bj = P*Bj*s' + p*ej - s*aj mod P*q0, where p is the plaintext space (i.e. 2 or 2^r, or 1 for CKKS) and Bj is the product of the digit sizes corresponding to columns 0...j-1. (For example if we have digit sizes 3,5,7 then B0=1, B1=3, B2=15 and B3=105.) Also, q0 is the product of all the "ciphertext primes" and P is roughly the product of all the special primes. (Actually, for BGV, if Q is the product of all the special primes then P=Q*(Q^{-1} mod p).)
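The factors Bj are just running products of the digit sizes. A minimal sketch (digit_weights is a hypothetical helper name, not part of the library) that reproduces the 3,5,7 example:

```python
from itertools import accumulate
from operator import mul

def digit_weights(digit_sizes):
    """Return [B0, B1, ..., Bt], where Bj is the product of the first j digit sizes."""
    # initial=1 makes B0 the empty product; requires Python 3.8 or later.
    return list(accumulate(digit_sizes, mul, initial=1))

weights = digit_weights([3, 5, 7])
print(weights)
```

Only B0...B(t-1) are used as column factors; the last entry is the product of all the digit sizes.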
In this implementation we save some space by keeping only a PRG seed for generating the pseudo-random elements, rather than the elements themselves.
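The idea of storing a seed instead of the bottom row can be illustrated as follows. This is only a sketch of the principle: bottom_row is a hypothetical helper, integers stand in for ring elements, and Python's random.Random is used as a stand-in for a real PRG (it is not cryptographically secure).

```python
import random

def bottom_row(seed: int, t: int, modulus: int) -> list:
    """Regenerate the t pseudo-random bottom-row elements from a stored seed."""
    rng = random.Random(seed)   # stand-in PRG, NOT suitable for real keys
    return [rng.randrange(modulus) for _ in range(t)]

stored_seed = 12345             # the only thing kept with the matrix
modulus = 65537 * 4093          # toy value playing the role of P*q0
first = bottom_row(stored_seed, 3, modulus)
again = bottom_row(stored_seed, 3, modulus)
print(first == again)
```

Re-expanding the same seed always yields the same elements, so only the top row and the seed need to be stored.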
To convert a ciphertext part R, we break R into digits R = sum_j Bj*Rj, then set (q0,q1)^T = sum_j Rj * column-j. Note that we have <(1,s),(q0,q1)> = sum_j Rj*(s*aj - s*aj + p*ej + P*Bj*s') = P * sum_j Bj*Rj * s' + p * sum_j Rj*ej = P * R * s' + p*a-small-element (mod P*q0), where the last element is small since the ej's are small and |Rj|<B. Note that if the ciphertext is encrypted relative to plaintext space p' and then key-switched with matrices W relative to plaintext space p, then we get a new ciphertext with noise p'*small+p*small, so it is valid relative to plaintext space GCD(p',p).
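The identity above can be checked numerically with a toy model in which plain integers stand in for ring elements. Everything here is an assumption made for illustration: the function names, the parameter values, and the choice to keep each ej next to its column (a real matrix would not store the noise).

```python
import random

def make_columns(s, s_prime, digit_sizes, p, P, q0, rng):
    """Build toy columns (bj, aj, ej) with bj = P*Bj*s' + p*ej - s*aj mod P*q0."""
    mod, Bj, cols = P * q0, 1, []
    for d in digit_sizes:
        aj = rng.randrange(mod)          # pseudo-random bottom element
        ej = rng.choice([-1, 0, 1])      # small noise, kept only for the check
        cols.append(((P * Bj * s_prime + p * ej - s * aj) % mod, aj, ej))
        Bj *= d                          # factor for the next column
    return cols

def decompose(R, digit_sizes):
    """Mixed-radix digits Rj with R = sum_j Bj*Rj and 0 <= Rj < digit size."""
    digits = []
    for d in digit_sizes:
        digits.append(R % d)
        R //= d
    return digits

def key_switch(R, cols, digit_sizes, mod):
    """Return (c0, c1) = sum_j Rj * column-j mod P*q0, plus the digits used."""
    digits = decompose(R, digit_sizes)
    c0 = sum(Rj * bj for Rj, (bj, _, _) in zip(digits, cols)) % mod
    c1 = sum(Rj * aj for Rj, (_, aj, _) in zip(digits, cols)) % mod
    return c0, c1, digits

rng = random.Random(1)
p, P, q0 = 2, 65537, 4093               # toy plaintext space and moduli
digit_sizes = [16, 16, 16]              # product 4096 covers all R < q0
s, s_prime = -2, 3                      # toy "secret keys"
mod = P * q0

cols = make_columns(s, s_prime, digit_sizes, p, P, q0, rng)
R = 1234
c0, c1, digits = key_switch(R, cols, digit_sizes, mod)

noise = p * sum(Rj * ej for Rj, (_, _, ej) in zip(digits, cols))
lhs = (c0 + s * c1) % mod               # <(1,s),(c0,c1)>
rhs = (P * R * s_prime + noise) % mod   # P*R*s' + p*small
print(lhs == rhs, abs(noise) < P)
```

In this toy setting the noise term is at most 2*3*15 = 90 in absolute value, far below P, which mirrors the claim that the extra term is small.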
The matrix W is defined modulo Q>t*B*sigma*q0 (with sigma a bound on the size of the ej's), and Q is the product of all the small primes in our moduli chain. However, if p is much smaller than B then it is enough to use W mod Qi with Qi a smaller modulus, Qi>p*sigma*q0. Also note that if p<Br then we will be using only the first r columns of the matrix W.