<rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Notes]]></title><description><![CDATA[Obsidian digital garden]]></description><link>http://github.com/dylang/node-rss</link><image><url>site-lib/media/favicon.png</url><title>Notes</title><link/></image><generator>Webpage HTML Export plugin for Obsidian</generator><lastBuildDate>Fri, 03 Apr 2026 14:43:20 GMT</lastBuildDate><atom:link href="site-lib/rss.xml" rel="self" type="application/rss+xml"/><pubDate>Fri, 03 Apr 2026 14:43:17 GMT</pubDate><ttl>60</ttl><dc:creator/><item><title><![CDATA[Distribution]]></title><description><![CDATA[A random variable maps a random event to a real number; a distribution measures the probability of any such event. We have several ways to represent a distribution:
Cumulative Distribution Function: Probability Density Function: Probability Mass Function: according to <a data-href="Continuous Functions#Fundamental Theorem of Calculus" href="math/mathematical-analysis/continuous-functions.html#Fundamental Theorem of Calculus" class="internal-link" target="_self" rel="noopener nofollow">Continuous Functions &gt; Fundamental Theorem of Calculus</a>, we can briefly define the CDF as. Thus: the median of a distribution is defined as. Discrete: the value that maximizes the probability; continuous: the value that maximizes the probability density:
pathological: runs counter to intuition
well-behaved (nice): does not run counter to intuition
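As a concrete sketch of these representations (my illustration using scipy's standard normal, not part of the note): the CDF, PDF, and inverse CDF are just different views of one distribution, and the median is where the CDF reaches 1/2.

```python
import math
from scipy.stats import norm

# Standard normal: CDF at 0 is P(X <= 0) = 0.5
print(norm.cdf(0.0))
# PDF at 0 is the density 1/sqrt(2*pi)
print(norm.pdf(0.0))
# ppf inverts the CDF, so the median is where the CDF hits 1/2
print(norm.ppf(0.5))
```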
]]></description><link>math/statistics/distribution/distribution.html</link><guid isPermaLink="false">math/Statistics/Distribution/Distribution.md</guid><pubDate>Fri, 03 Apr 2026 14:27:11 GMT</pubDate></item><item><title><![CDATA[Poisson Process]]></title><description><![CDATA[Models:
Bernoulli: the distribution of a binary-outcome (happened / didn't happen) experiment; the parameter can be read as its expectation / probability of happening.
Binomial: the number of happenings in i.i.d. Bernoulli experiments, parameterized by , the probability of happening in each experiment. Its formula is formed by accumulating all arrangements in which the outcome happens the given number of times.
Poisson: if the probability is evenly distributed over a continuous space, and in that space the expected rate of arrival / number of happenings is , then for each subspace the probability of happening in that subspace is , and the number of happenings in the whole space is the limit of the Binomial distribution.
Exponential: the distribution of the time of the "first happening". If the rate of arrival per time unit is , then the rate of arrival in a space is , and the probability of nothing happening for time units is calculated simply by plugging into the PMF of the Poisson; transform, take the derivative, and we get the PDF of the exponential distribution.
Gamma Distribution: an extension of the Exponential Distribution: the distribution of the time at which the "n-th event happens". The probability per trial is , and the probability of occurrences in trials is — which is why it is called the binomial distribution. A stochastic process is called a Poisson process when:
Starts from zero: the event count at time zero is 0. Memoryless: the distribution is the same no matter when you start timing. Independence: occurrences of events do not interfere with each other.
Sparsity: the probability of two events happening at the same instant is essentially 0. I think it could be written as: Notation misuse: can be seen as a random variable representing the number of events happening within the duration . It is a duration, not an instant; but when timing starts from zero by default, it can also denote this duration, in which case it is an instant.
Poisson Process: the basic model of random events occurring in continuous time. Arrival rate: the rate of occurrence per unit time.
Counting view: the distribution of the number of occurrences per unit time (discrete): the Poisson distribution.
Interval view: the time interval between two adjacent events (continuous): the exponential distribution.
Waiting view: the total time elapsed from time zero until the n-th event (continuous): the gamma distribution.
How to calculate the probability of happening in a time interval, or formally, how to calculate ? Define: and we want: Approach: use the binomial distribution. The <a data-href="#Binomial Distribution" href="math/statistics/poisson-process.html#Binomial_Distribution_0" class="internal-link" target="_self" rel="noopener nofollow">Binomial Distribution</a> describes the probability of happening out of tests; if we divide a space (for example, the unit time interval we are interested in now) into small subspaces, check whether the event happens in each, and take the limit of , we get the probability of happening in that continuous space. As we divide the time unit into small even subspaces, as , every variable is Bernoulli distributed as defined by the Poisson process: and thus the event "happening times in a time unit" becomes "happening times in tests, ", which makes the probability computable by the PMF of the Binomial distribution. But what is ?<br>As all "test results" are i.i.d. and <a data-tooltip-position="top" aria-label="Bernoulli Distribution" data-href="#Bernoulli Distribution" href="math/statistics/poisson-process.html#Bernoulli_Distribution_0" class="internal-link" target="_self" rel="noopener nofollow">Bernoulli distributed</a>, , so , and let's say is the expectation of ; then . So: apparently, the expectation of the Poisson distribution should be : The variance of the Poisson distribution is also : and is called the "arrival rate". What is the probability of the wait time? Or, since the wait time is a continuous variable, what is the probability that the wait time is greater than some threshold ? Formally: define as the wait time; what is ? If it is a Poisson process, then basically it says that you wait and nothing happens; that event is . And the probability has no memory: It is straightforward to get the CDF and PDF of : CDF: PDF: Pick the right tool
These functions -- PDF, CDF -- are just different views of how a random variable is distributed. Pick the right tool for the problem at hand.
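The binomial-limit construction above can be checked numerically; a small sketch using scipy (the rate and count values are arbitrary choices of mine):

```python
from scipy.stats import binom, poisson

lam = 3.0   # arrival rate per unit time
k = 2       # number of events in the unit interval

# Divide the unit interval into n slots, each a Bernoulli trial with p = lam/n;
# Binomial(n, lam/n) approaches Poisson(lam) as n grows.
for n in [10, 100, 10000]:
    print(n, binom.pmf(k, n, lam / n))
print("poisson", poisson.pmf(k, lam))
```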
If a variable describes the time for i.i.d. events to happen, and this is a Poisson process, then it is distributed as
Deduction: from general premises to a more specific conclusion
Induction: from special cases to a general form
For a backup system to survive for , the probability is. Take the opposite event -- you wait no more than time units, and more than events happen in -- this is the CDF of your wait time: take the derivative; you get an alternating series in which all terms except the first cancel out, so the PDF is. Explanation:
<br>the <a data-href="Gamma Function" href="math/mathematical-analysis/applications/gamma-function.html" class="internal-link" target="_self" rel="noopener nofollow">Gamma Function</a>: scale parameter: , actually the reciprocal of the arrival rate
shape parameter: , actually the number of events to wait for.
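The shape/scale reading above can be cross-checked against the counting view: "the n-th event arrives within t" is the same event as "at least n arrivals in [0, t]". A sketch with scipy (the parameter values are arbitrary):

```python
from scipy.stats import gamma, poisson

lam = 0.2   # arrival rate
n = 3       # wait for the 3rd event
t = 10.0    # time horizon

# Gamma waiting-time view: shape k = n, scale theta = 1/lam
wait_cdf = gamma.cdf(t, a=n, scale=1 / lam)
# Poisson counting view: P(N(t) >= n)
count_sf = poisson.sf(n - 1, lam * t)
print(wait_cdf, count_sf)   # the two probabilities agree
```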
]]></description><link>math/statistics/poisson-process.html</link><guid isPermaLink="false">math/Statistics/Poisson Process.md</guid><pubDate>Fri, 03 Apr 2026 02:28:17 GMT</pubDate></item><item><title><![CDATA[Financial Independence]]></title><description><![CDATA[As an independent citizen in a thriving country, you earn salaries, pay checks for your family and invest your savings to compound the profits.Your dream is that one day, you don't have to worry about losing the job -- the passive income can still cover your paycheck. And more ideally, your savings earn as much as you can -- you can just retire early!What you would need to do is:
Save as much as possible and live at low cost
Earn money from the market to create passive income, and make your best effort to avoid losing money.
But does it really work? Let's solve it with math and visualize it. You earn money and spend money every year, so you save per year; the earnings and expenses increase at rate per year, and so does . The feasible investment return is . Question: when can you achieve financial independence? The next year's assets : financial independence conditions. The formula of ; the formula of . Condition: the -th year's assets should generate enough passive income for the expenses of year : expense coverage: (cost coverage = 100%). Solve for . If : if : the investment return compared with the next year's salary: when your investment return reaches your salary, you retire (salary coverage = 100%). Assuming you graduated from the university as a PhD and are 30 years old now, you just want to know what saving ratio you need to achieve financial independence and how many years it would take. According to the <a data-tooltip-position="top" aria-label="https://data.worldbank.org/indicator/NY.GDS.TOTL.ZS?locations=CN" rel="noopener nofollow" class="external-link is-unresolved" href="https://data.worldbank.org/indicator/NY.GDS.TOTL.ZS?locations=CN" target="_self">world bank</a>, China's domestic savings were of GDP in 2024; given the public's current pessimistic opinion of the economy, let's set as a saving-ratio goal. Let's set your first-year salary. This does not affect your financial independence, but it affects your balance after retiring. Assuming you graduated from a good school, found a good job and earn per month, your first-year income is , so you have a balance of . Assuming your salary goes up year by year, and since the CAGR of post-90s salaries is now around , let's set it conservatively at for you, one of Gen Z. As for investment, I hope you won't invest in garbage assets but in those with real value, realizing a reasonable return rate -- 8%. Finally, the simulation says that with saving + investment:
it would take you around 17 years (at age 47) for the expense coverage to reach 1 -- your assets build a firewall for your life.
it would take you around 25 years (at age 55) for the salary coverage to reach 1 -- now you can just retire!
your assets are 1.58x more than what you earned from your boss
and you earn dollars.
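The yearly loop described above can be sketched in a few lines. This is my minimal illustration, not the notebook's code, and the note's actual figures were not preserved, so all numbers below (saving ratio, first-year income, growth and return rates) are placeholder assumptions:

```python
def years_to_salary_coverage(save_ratio, income, growth, ret, max_years=60):
    """Return the first year in which passive income (assets * ret)
    covers next year's salary; None if it never happens."""
    asset = 0.0
    for year in range(1, max_years + 1):
        # compound last year's assets, then add this year's savings
        asset = asset * (1 + ret) + income * save_ratio
        if asset * ret >= income * (1 + growth):   # salary coverage >= 1
            return year
        income *= 1 + growth
    return None

# placeholder parameters: 50% saved, 300k first-year income, 2% raises, 8% return
print(years_to_salary_coverage(0.5, 300_000, 0.02, 0.08))
```

Raising the saving ratio shortens the horizon much more than raising income does, since income also scales the expense side.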
<br><img alt="Pasted image 20260329033153.png" src="living/philosophy/attachments/pasted-image-20260329033153.png" target="_self">This model assumes that your salary grows exponentially. But since we're Gen Z, most people can't increase their salary exponentially these days. Surprisingly, it still takes years to reach Phase 1 and years to reach Phase 2, but this only brings your assets to 4 million rather than 16 million.<br><img alt="Pasted image 20260329044058.png" src="living/philosophy/attachments/pasted-image-20260329044058.png" target="_self">But as a PhD you're about to reach 8 million in assets and a budget of per year.<br><img alt="Pasted image 20260329044841.png" src="living/philosophy/attachments/pasted-image-20260329044841.png" target="_self"><br><a data-tooltip-position="top" aria-label="https://colab.research.google.com/drive/1Fz-JWZnYI1Fyi-_TES98sPQ_Q46OdNNI?usp=sharing" rel="noopener nofollow" class="external-link is-unresolved" href="https://colab.research.google.com/drive/1Fz-JWZnYI1Fyi-_TES98sPQ_Q46OdNNI?usp=sharing" target="_self">Interactive notebook to visualize your financial plan consequences on google colab</a>]]></description><link>living/philosophy/financial-independence.html</link><guid isPermaLink="false">living/philosophy/Financial Independence.md</guid><pubDate>Tue, 31 Mar 2026 18:12:59 GMT</pubDate></item><item><title><![CDATA[Backups]]></title><description><![CDATA[Backup failure rate: if you have one backup (your computer, NAS), then the chance you lose the data is distributed as: Assuming the data are expected to break years later, and you have a RAID of 2/3/4 disks, let's see the difference: The time before all disks have failed is distributed as <a data-href="Poisson Process#Gamma Distribution" href="math/statistics/poisson-process.html#Gamma Distribution" class="internal-link" target="_self" rel="noopener nofollow">Poisson Process &gt; Gamma Distribution</a> if you don't replace a disk once it fails.<br><img alt="Backups.png" src="living/electronics/attachments/backups.png" target="_self"><br>And it's very unlikely for all disks to fail completely within just one year. The failure times of the disks are <a data-tooltip-position="top" aria-label="Poisson Process > Exponential Distribution" data-href="Poisson Process#Exponential Distribution" href="math/statistics/poisson-process.html#Exponential Distribution" class="internal-link" target="_self" rel="noopener nofollow">exponentially distributed</a>. Your single-disk RAID has a chance to survive the first year. So just wait for the price to drop to an affordable range!
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gamma

# Set up the figure with a modern, clean aesthetic
plt.style.use('seaborn-v0_8-whitegrid')
fig, ax = plt.subplots(figsize=(10, 6), dpi=120)

# Define x range
x = np.linspace(0, 20, 1000)

# Different shape (k) and scale (theta) parameters for the gamma distribution
# Gamma PDF: f(x; k, θ) = x^(k-1) * exp(-x/θ) / (Γ(k) * θ^k)
# or using the rate parameter β = 1/θ
params = [
    (1, 5, 'k=1, θ=5 (Exponential)'),
    (2, 5, 'k=2, θ=5'),
    (3, 5, 'k=3, θ=5'),
    (4, 5, 'k=4, θ=5'),
]
colors = ['#FF6B6B', '#4ECDC4', '#45B7D1', '#96CEB4', '#FFEAA7']
for (k, theta, label), color in zip(params, colors):
    # scipy uses the scale parameter (theta)
    y = gamma.pdf(x, a=k, scale=theta)
    ax.plot(x, y, color=color, linewidth=2.5, label=label)

# Add subtle annotation
ax.text(5, 0.25, r'$f(x; k, \theta) = \frac{x^{k-1}e^{-x/\theta}}{\Gamma(k)\theta^k}$',
        fontsize=11, bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.5))
ax.set_xlabel('x', fontsize=12)
ax.set_ylabel('Probability Density f(x)', fontsize=12)
ax.set_title('Gamma Distribution Probability Density Functions', fontsize=14, fontweight='bold')
ax.legend(loc='upper right', framealpha=0.9)
ax.set_xlim(0, 20)
ax.set_ylim(0, 0.3)
plt.tight_layout()
plt.show()
]]></description><link>living/electronics/backups.html</link><guid isPermaLink="false">living/Electronics/Backups.md</guid><pubDate>Tue, 31 Mar 2026 17:02:18 GMT</pubDate><enclosure url="living/electronics/attachments/backups.png" length="0" type="image/png"/><content:encoded>&lt;figure&gt;&lt;img src="living/electronics/attachments/backups.png"&gt;&lt;/figure&gt;</content:encoded></item><item><title><![CDATA[Variance]]></title><description><![CDATA[applying <a data-href="Expectation#Linearity" href="math/statistics/distribution/expectation.html#Linearity" class="internal-link" target="_self" rel="noopener nofollow">Expectation &gt; Linearity</a> and <a data-href="Expectation#Distributivity w.r.t addition" href="math/statistics/distribution/expectation.html#Distributivity w.r.t addition" class="internal-link" target="_self" rel="noopener nofollow">Expectation &gt; Distributivity w.r.t addition</a>: Proof: Suppose , is a constant: Proof 1:<br>Given (proofs are in <a data-href="Expectation#Linearity" href="math/statistics/distribution/expectation.html#Linearity" class="internal-link" target="_self" rel="noopener nofollow">Expectation &gt; Linearity</a> and <a data-href="Distribution#Transformation" href="math/statistics/distribution/distribution.html#Transformation_0" class="internal-link" target="_self" rel="noopener nofollow">Distribution &gt; Transformation</a>): Thus: Proof 2:<br>with <a data-href="Expectation#LOTUS" href="math/statistics/distribution/expectation.html#LOTUS" class="internal-link" target="_self" rel="noopener nofollow">Expectation &gt; LOTUS</a> the proof is simpler: Variance between two r.v.s. Also:]]></description><link>math/statistics/distribution/variance.html</link><guid isPermaLink="false">math/Statistics/Distribution/Variance.md</guid><pubDate>Wed, 18 Mar 2026 05:48:06 GMT</pubDate></item><item><title><![CDATA[Expectation]]></title><description><![CDATA[For a discrete random variable: For a continuous random variable: LOTUS (Law of the Unconscious 
Statistician) Theorem: It's all about sign cancelling. The probability density function is defined as: given , to transform to a probability about , slice . If is monotonically decreasing in ; if is monotonically increasing in : take the derivative of and apply the chain rule: both cases generate the same expression: So for piecewise monotonic : this leads to LOTUS for continuous variables: the potential negative sign of also cancels with the potential flipping of the integral limits caused by substituting variables in the definite integral: For discrete variables it's pretty intuitive: Since: So: Given by <a data-href="#LOTUS" href="math/statistics/distribution/expectation.html#LOTUS_0" class="internal-link" target="_self" rel="noopener nofollow">LOTUS</a> and linearity of the <a data-href="Riemann Integral" href="math/mathematical-analysis/integrals/riemann-integral.html" class="internal-link" target="_self" rel="noopener nofollow">Riemann Integral</a>, applying LOTUS: In conclusion: Distributivity w.r.t. addition: Expectation is also distributive w.r.t. addition: Independence required: Distributivity w.r.t. 
multiplication while independent: Given , expectation is distributive w.r.t. multiplication:]]></description><link>math/statistics/distribution/expectation.html</link><guid isPermaLink="false">math/Statistics/Distribution/Expectation.md</guid><pubDate>Tue, 17 Mar 2026 15:56:53 GMT</pubDate></item><item><title><![CDATA[Correlation]]></title><link>math/statistics/distribution/correlation.html</link><guid isPermaLink="false">math/Statistics/Distribution/Correlation.md</guid><pubDate>Thu, 05 Mar 2026 08:12:21 GMT</pubDate></item><item><title><![CDATA[Base]]></title><description><![CDATA[default view, 200 results. File names: 3D Gaussian Splitting, 2025-10-08 股市分析, 2025-10-14 技术分析, 2025-11-24 爆竹投机心得, 2025-12-29 深圳北站, 2026.2.9 交易预案, Adversarial Search, Aliasing and Anti-aliasing, Asymptotic Notations, Base.base, Bayesian Decision Theory, Big Data Tools, Bitcoin, Central Limit Theorem]]></description><link>base.html</link><guid isPermaLink="false">Base.base</guid><pubDate>Tue, 31 Mar 2026 16:02:10 GMT</pubDate></item><item><title><![CDATA[Home Page]]></title><description><![CDATA[This is my knowledge base.
<a data-tooltip-position="top" aria-label="Mathematical Analysis.canvas" data-href="Mathematical Analysis.canvas" href="math/mathematical-analysis/mathematical-analysis.html" class="internal-link" target="_self" rel="noopener nofollow">Mathematical Analysis</a> - foundation of advanced topics: real numbers, limits, continuous functions, series...
Linear Algebra - tools for manipulating data efficiently.
Statistics: tools for analyzing data and making decisions. Artificial Intelligence: Searching Algorithms (A*, CSP, Adversarial Search)
Markov Decision Process Machine Learning Principles: Estimation: Parameter Estimation (MLE, MAP)
Non-Parametric Estimation (KDE) Optimization Methods (EM, Lagrange Multiplier)
Decision Making: Bayesian Decision Theory Computer Graphics: Object Modeling
Rendering Pipeline
Ray-Tracing
Aliasing and Anti-Aliasing Natural Language Processing: Language Modeling
Word Embeddings
Transformers ]]></description><link>index.html</link><guid isPermaLink="false">index.md</guid><pubDate>Tue, 31 Mar 2026 15:34:03 GMT</pubDate></item><item><title><![CDATA[Problem Set 2]]></title><description><![CDATA[<a data-href="Poisson Process#Poisson Distribution" href="math/statistics/poisson-process.html#Poisson Distribution" class="internal-link" target="_self" rel="noopener nofollow">Poisson Process &gt; Poisson Distribution</a>Poisson Distribution: the distribution of a random variable that measures how many events happen in a given period of time, with a parameter called the arrival rate. Likelihood function of given samples . MLE of : find the stationary point: the second derivative of this function is , so the stationary point is: , which is the sample mean. Clark's data: , the sample mean is:
from scipy.stats import poisson

n = 576
mu = 535 / n
v_f = []
v_p = []
for i in range(0, 6):
    p = poisson.pmf(i, mu)
    v_p.append(p)
    f = p * n
    v_f.append(f)

print("Probabilities: ")
for i, p in enumerate(v_p):
    print(f"P(X={i}) = {p:.3}")

print("Expected Counts: ")
for i, f in enumerate(v_f):
    print(f"ec(X={i}) = {f:.3}")

The outputs are:
Probabilities:
P(X=0) = 0.395
P(X=1) = 0.367
P(X=2) = 0.17
P(X=3) = 0.0528
P(X=4) = 0.0122
P(X=5) = 0.00228 Expected Counts:
ec(X=0) = 2.28e+02
ec(X=1) = 2.11e+02
ec(X=2) = 98.1
ec(X=3) = 30.4
ec(X=4) = 7.06
ec(X=5) = 1.31
which is remarkably close to the actual data, so my conclusion is that the assumption is very likely true.]]></description><link>masters/semb/cs5487-ml/problem-set-2.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Problem Set 2.md</guid><pubDate>Mon, 30 Mar 2026 19:14:14 GMT</pubDate></item><item><title><![CDATA[PS-9]]></title><link>masters/semb/cs5487-ml/attachments/ps-9.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Attachments/PS-9.pdf</guid><pubDate>Thu, 26 Mar 2026 08:14:30 GMT</pubDate></item><item><title><![CDATA[PS-8]]></title><link>masters/semb/cs5487-ml/attachments/ps-8.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Attachments/PS-8.pdf</guid><pubDate>Thu, 26 Mar 2026 07:32:39 GMT</pubDate></item><item><title><![CDATA[Newton-Raphson Method]]></title><description><![CDATA[an iterative scheme for finding a zero of a function
guess
find the root of the tangent line as a new guess
go to 2 until convergence
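The three steps above can be sketched in a few lines (a minimal illustration; the target function and starting guess are arbitrary choices of mine):

```python
def newton_raphson(f, df, x0, tol=1e-12, max_iter=100):
    """Find a zero of f by repeatedly replacing the guess with
    the root of the tangent line at the current guess."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / df(x)
        x -= step                 # root of the tangent: x - f(x)/f'(x)
        if abs(step) < tol:       # converged
            return x
    return x

# Example: sqrt(2) as the positive zero of f(x) = x^2 - 2
print(newton_raphson(lambda x: x * x - 2, lambda x: 2 * x, 1.0))
```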
]]></description><link>masters/semb/cs5487-ml/optimization/newton-ralphson-method.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Optimization/Newton-Ralphson Method.md</guid><pubDate>Thu, 26 Mar 2026 07:26:34 GMT</pubDate></item><item><title><![CDATA[Mathematical Analysis]]></title><description><![CDATA[
Integer
Rational Number: the quotient of two integers, with a nonzero denominator
Real Number: fills the gaps in the rationals; using a Dedekind Cut, a unique real number can be defined at each gap in the rationals (Dedekind Completeness Theorem).
The meaning of completeness: the set of reals has no gaps. The six equivalent theorems of real completeness:
Supremum principle: every nonempty subset of the reals that is bounded above has a supremum (least upper bound)
Monotone convergence theorem: every monotone bounded sequence converges
Nested interval theorem: a nested sequence of closed intervals converges to a unique real number
Finite cover theorem: every open cover of a closed interval has a finite subcover
Accumulation point theorem (Bolzano-Weierstrass Theorem)
Cauchy criterion: a sequence converges if and only if it is a Cauchy Sequence. The Dedekind definition of the reals: the set of reals defined by Dedekind Cuts: the reals are defined as all Dedekind cuts: a partition of into two sets and such that:
Each element of A is less than each element of B
A contains no largest element
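The two conditions can be stated formally; this is one common form of the definition (notation mine):

```latex
A cut is a partition $\mathbb{Q} = A \cup B$ with $A, B \neq \emptyset$, $A \cap B = \emptyset$, such that
\[
\forall a \in A,\ \forall b \in B:\ a < b,
\qquad
\forall a \in A,\ \exists a' \in A:\ a < a',
\]
so $A$ is downward closed and has no largest element; the real number is identified with $A$.
```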
Another interpretation of this definition: every real number is defined as a subset of the rational number set . Proposition: an ordered partition of the reals ( , ) determines a unique real number. In the reals, a nonempty set bounded above must have a least upper bound (Supremum). Consider the union of the left sets of the Dedekind cuts of all reals in the set: it is a Dedekind cut (i.e. a real number)
It is a set of rationals: each left set is a set of rationals
Nonempty and proper: has an upper bound, therefore , therefore 
Downward closure / orderedness: for any , if , then ; and since , , so , or 
No largest element:
It is an upper bound of : It is the least upper bound of :
For any upper bound of , 
A real number is a set
In the Dedekind Cut definition, a real number is defined by a cut, so the left set of that cut -- the set of rationals smaller than it -- defines a real number.
Definition of the order on the reals. Supremum principle: proof idea
A real number under the Dedekind Cut definition is the left set of an ordered partition of the rationals, and any set of reals bounded above can be converted into the union of the corresponding left sets. That union is itself a set of rationals and corresponds to a real number; it is the least superset of the set, so whenever the set is bounded above, the supremum exists and always equals the union of the equivalent left sets.
Dedekind Completeness Theorem: proof idea
Prove the supremum principle: every set of reals bounded above has a supremum, hence every left set of a partition of the reals has a supremum, hence an ordered partition uniquely determines a real number.
[!note] Why the supremum principle is exactly the completeness of the reals
The supremum principle says every bounded set of reals has a supremum; that is, any partition of the reals determines a unique real number through the boundedness of its left set. This way the reals, unlike the rationals, have no "holes" where a cut fails to determine a unique number.
Density of the rationals: between any two real numbers one can find a rational number.
For , , so there always exists 
A monotone bounded sequence converges: , exists. By the supremum principle, let the supremum of be ; therefore a monotone bounded sequence must converge, and it converges to its supremum. Closed-interval nesting: there is one and only one point belonging to all intervals. For a sequence of closed intervals , , the sequence of left endpoints is monotonically increasing and bounded, so its limit exists and 
The sequence of right endpoints likewise has a limit , and, by the limit laws and the interval lengths tending to zero: the point is at most every right endpoint and at least every left endpoint, therefore 
Uniqueness: if two such points were both inside all intervals, the endpoint sequences would have two different limits, contradicting the interval lengths tending to zero. So the point is unique. Nested open intervals cannot determine a unique point: for any inside , there is always a large enough such that , hence closed intervals. Compactness
Open Cover: an open cover of a set, , where each is an open interval
Finite Subcover
If a closed interval has an open cover, then it has a finite subcover:
Suppose it cannot be finitely covered; bisect it and choose a half that cannot be finitely covered as , and continue this operation to build nested closed intervals. By the nested interval theorem this determines a unique real number; since the cover covers it, , and since it is an open cover, , so an interval defined as not finitely coverable is finitely covered -- contradiction. Accumulation point theorem: a bounded sequence must have a convergent subsequence. A bounded infinite sequence must have an accumulation point (a convergent subsequence). Proof:
Let be the half of the bisected that contains infinitely many terms, and construct nested closed intervals this way, which determine a unique point. Choose from each ; then by the squeeze theorem: . Cauchy criterion: a real sequence converges if and only if it is a Cauchy Sequence. Proof: necessity is clear -- a convergent sequence must be Cauchy. Sufficiency: a Cauchy sequence is bounded, so there is a convergent subsequence converging to (<a data-href="#Bolzano-Weierstrass Theorem" href="math/mathematical-analysis/mathematical-analysis.html#Bolzano-Weierstrass_Theorem_0" class="internal-link" target="_self" rel="noopener nofollow">Bolzano-Weierstrass Theorem</a>); when selecting the subsequence the indices increase with , and , so for any , find such that both the gap between Cauchy-sequence terms and the distance between the subsequence and a are less than ; choose , then 
Real Number. If, by selecting a small punctured neighborhood of , a variable determined by can get arbitrarily close to a fixed value, then that variable converges to that fixed value. is saying: the limit of of as approaches ; it means that the value of the function can be made arbitrarily close to by choosing x close to . is saying: in plain words, if all paths converge to one value, then the function converges in that region with no blind spots. Proof:
If the function converges, then naturally every path converges
If the function does not converge, then a bad, non-convergent sequence can be constructed
Sign-Preserving Property
If a function has a limit at , then there is a punctured neighborhood of on which the function keeps the same sign as the limit. Suppose . Set , then , , So as when . Squeeze Theorem
if , and and have the same limit at , then also has a limit at . Order Property of Limits
If , and , then: if , then for , : which contradicts . So . Local Boundness
If , is bounded in some punctured neighborhood of . Swappable with scalar product. Proof: with , , thus: Distributive to function addition. Proof: (triangle inequality) with , with . Thus. Distributive to function product. Proof:
involves a trick of constructing and <br>be aware that is locally bounded due to <a data-href="#Local Boundness" href="#Local Boundness" class="internal-link" target="_self" rel="noopener nofollow">Local Boundness</a>
:
with , , thus with ,
so, with , distributive to function devisionProofkey steps:step1: Proof <br>step2: use <a data-href="#distributive to function product" href="#distributive to function product" class="internal-link" target="_self" rel="noopener nofollow">distributive to function product</a>:Detail ProofThis proof is divided into two steps:
proof of and the reverse triangle inequality
product rule
with , , which says: with , Thus: Thus: applying the product rule: composition rule
if is continuous, then . Limits: if a function is continuous on , it is saying that for any , s.t.: If a function is continuous on a closed interval , then it is bounded on that interval.<br>Proof with the <a data-tooltip-position="top" aria-label="Real Number > Bolzano-Weierstrass Theorem" data-href="Real Number#Bolzano-Weierstrass Theorem" href="math/mathematical-analysis/real-number.html#Bolzano-Weierstrass_Theorem_0" class="internal-link" target="_self" rel="noopener nofollow">Bolzano-Weierstrass Theorem</a>. Suppose the function is unbounded on . If the function is unbounded, loop this process, starting with :
bisect the interval with , select the unbounded half or select a point making this half unbounded to build a sequence , and continue this process with . Finally we have an infinite bounded sequence and an unbounded sequence s.t.:
<br>converges to some number (<a data-tooltip-position="top" aria-label="Real Number > Bolzano-Weierstrass Theorem" data-href="Real Number#Bolzano-Weierstrass Theorem" href="math/mathematical-analysis/real-number.html#Bolzano-Weierstrass_Theorem_0" class="internal-link" target="_self" rel="noopener nofollow">Bolzano-Weierstrass Theorem</a>)
<br> is unbounded, which contradicts (<a data-tooltip-position="top" aria-label="math/Mathematical Analysis/Limits > Heine's Theorem" data-href="math/Mathematical Analysis/Limits#Heine's Theorem" href="math/mathematical-analysis/limits.html#Heine's_Theorem_0" class="internal-link" target="_self" rel="noopener nofollow">Heine's Theorem</a>)
So the function is bounded on . IVT
If a function is continuous on a closed interval , then for any intermediate value , there is some value such that . Proof: build nested intervals with this process:
bisect the interval with ; if , stop.
select the half making , and continue this process
<br>So we get a nested interval , which locates exactly one real number , and (<a data-tooltip-position="top" aria-label="Real Number > Nested Interval Theorem" data-href="Real Number#Nested Interval Theorem" href="math/mathematical-analysis/real-number.html#Nested_Interval_Theorem_0" class="internal-link" target="_self" rel="noopener nofollow">Nested Interval Theorem</a>). As the function is continuous on , with <a data-tooltip-position="top" aria-label="math/Mathematical Analysis/Limits > Heine's Theorem" data-href="math/Mathematical Analysis/Limits#Heine's Theorem" href="math/mathematical-analysis/limits.html#Heine's_Theorem_0" class="internal-link" target="_self" rel="noopener nofollow">Heine's Theorem</a>: So we have another set of nested intervals that locate a unique number, which can only be since . (Nested Interval Theorem) Extreme Value Theorem
If a function is continuous on , then there is a maximum and a minimum in .
<br>As the <a data-href="#Boundedness Theorem" href="math/mathematical-analysis/mathematical-analysis.html#Boundedness_Theorem_0" class="internal-link" target="_self" rel="noopener nofollow">Boundedness Theorem</a> says, the function is bounded, so the function has a supremum and infimum, w.r.t. the <a data-tooltip-position="top" aria-label="Real Number > Dedekind Completeness Theorem" data-href="Real Number#Dedekind Completeness Theorem" href="math/mathematical-analysis/real-number.html#Dedekind_Completeness_Theorem_0" class="internal-link" target="_self" rel="noopener nofollow">Dedekind Completeness Theorem</a>. Say the supremum is . Suppose . is also a continuous function, thus it is also bounded. Suppose is an upper bound of on : so is also an upper bound for on , which contradicts the assumption that is the supremum. So there must be some value s.t. . Fermat's Lemma
the derivative at a maximum / minimum must be zero
Proof: Suppose is a maximum. It says that:
if a function is continuous on and , then: <br>A function is continuous on ; then with the <a data-href="#Extreme Value Theorem" href="math/mathematical-analysis/mathematical-analysis.html#Extreme_Value_Theorem_0" class="internal-link" target="_self" rel="noopener nofollow">Extreme Value Theorem</a>, there must be a maximum , and then with <a data-href="#Fermat's Lemma" href="math/mathematical-analysis/mathematical-analysis.html#Fermat's_Lemma_0" class="internal-link" target="_self" rel="noopener nofollow">Fermat's Lemma</a>, Lagrange MVT
If a function is continuous on , then: , which says the derivative at some intermediate point captures the change between the endpoints.
Proof:<br>build a flat function containing to apply <a data-href="#Rolle's MVT" href="math/mathematical-analysis/mathematical-analysis.html#Rolle's_MVT_0" class="internal-link" target="_self" rel="noopener nofollow">Rolle's MVT</a>:So that So with Rolle's MVT, Cauchy MVT
if functions are both continuous on , then: construct a function whose derivative is in the form that fits the theorem:<br>so by <a data-href="#Rolle's MVT" href="math/mathematical-analysis/mathematical-analysis.html#Rolle's_MVT_0" class="internal-link" target="_self" rel="noopener nofollow">Rolle's MVT</a>: MVT for Integrals
If a function is continuous on , then the mean integral is an intermediate value of <br> is continuous on , so with the <a data-href="#Boundedness Theorem" href="math/mathematical-analysis/mathematical-analysis.html#Boundedness_Theorem_0" class="internal-link" target="_self" rel="noopener nofollow">Boundedness Theorem</a>, is bounded:<br>then with the <a data-tooltip-position="top" aria-label="Intermediate Value Theorem" data-href="#Intermediate Value Theorem" href="math/mathematical-analysis/mathematical-analysis.html#Intermediate_Value_Theorem_0" class="internal-link" target="_self" rel="noopener nofollow">IVT</a>: if is continuous at , the accumulation function is one of the antiderivatives. Integration is difference: Proof:<br>Prove it by the <a data-href="#MVT for Integrals" href="math/mathematical-analysis/mathematical-analysis.html#MVT_for_Integrals_0" class="internal-link" target="_self" rel="noopener nofollow">MVT for Integrals</a>: So any original function of , denoted by , can be represented by adding some constant to : substituting with , we know the constant is ; substituting with , the integral of from to is the difference of the values of any antiderivative at the endpoints: it must be continuous to be derivable, i.e. differentiable. If is differentiable, then is also an infinitesimal -- which implies that is continuous. Continuous Functions. A Series is an infinite sum, or the limit of Partial Sums: a sum of numbers or a sum of functions. Geometric; Harmonic;
P-Series; Alternating. Power series: a series of power functions: a sum of powers of . Tests:
Ratio test
Integral test
Comparison test. To prove that the image of the function is the limit of the Taylor series, write the series as the sum of a partial sum and a Remainder: odd and even numbers: a product is odd only when all factors are odd; a sum is odd only if there is an odd number of odd terms. Proof. Given: So: the limit calculation distributes w.r.t. addition. Series. Why define the trigonometric functions with the unit circle and Cartesian coordinates
Defining them with right triangles restricts the angle to the range 0 to ; defining them by the coordinates of the intersection of the angle's side with the unit circle easily extends the domain of the trigonometric functions to all real values.
About 
For now we treat it not as the circle-circumference ratio but as the measure of a straight angle
From the Cartesian coordinate plane
单位圆与x轴正半轴交于点，以直线为一边，另一边与单位圆交于，且的坐标为，则角的正弦和余弦值为。换句话说，正弦函数和余弦函数将角的大小映射到的坐标分量上。的坐标也可记为Note
将所有常数写在前面方便后续套用改变运算顺序或消掉常数
大小为的角与单位圆的交点与关于x轴对称，因此有：大小为的角与单位圆的交点与关于原点对称，由此有：大小为的角与单位圆的交点与关于对称，由此有：仅使用勾股定理，分锐角三角形、钝角三角形、直角三角形三种情况（钝角三角形证明部分需要利用诱导公式）可证明对于任意由点ABC围成的平面三角形，记角ABC的对边长度为，角A大小为，有：作图，作出大小为, , 的角与单位圆的交点，利用两个角度大小为的弓形的弦长相等得Note
A subscript like this should be read as the value of a composite function at .
The remaining sum/difference formulas are derived merely by using the properties of real addition and subtraction, the cosine-difference formula, and the reduction formulas. By the homogeneity of the sine and cosine sum/difference formulas, the sum/difference formula for can be obtained. Setting and substituting into the formulas above easily gives: Reversing the double-angle formulas gives the half-angle formulas: A product of trigonometric values can be turned into a sum of trigonometric values. Notice that one of the two terms on the right side of a sum/difference formula is easy to cancel, which yields the product-to-sum formulas: Since a product can equal a sum/difference, a sum/difference can also equal a product. Substitution: The auxiliary-angle formula uses the arctangent and the sine-addition formula to turn any linear combination of the sine and cosine of an angle into a single sine function: defined and as a form of Taylor Series, with the heuristic of their geometric definition: Trigonometry Functions. Big : the ceiling; means the growth rate is upper-bounded by . This says: there exist a constant and a constant such that as , , i.e. is bounded by . Prove that as , , which says it is bounded by . Big Theta Notation: same growth rate; implies has exactly the same growth rate as ; says that as long as . little : strict ceiling; means becomes insignificant compared to , or . Rules for little o: multiplying by a constant doesn't change its order; multiplying by an infinitesimal makes it an even higher-order infinitesimal: is ; adding a higher-order infinitesimal doesn't change its order: if , then . little means strict, big means no ... than
"o" means an upper bound, "omega" means a lower bound, and "theta" means the same order as
Asymptotic Notations. and are equivalent infinitesimals, written as , which says that . For an oscillating function: and should not cross zero "near" ; otherwise the quotient is not defined.
<br>As for the <a data-tooltip-position="top" aria-label="Limits > Product Rule" data-href="Limits#Product Rule" href="math/mathematical-analysis/limits.html#Product_Rule_0" class="internal-link" target="_self" rel="noopener nofollow">Product Rule</a>, the equivalence relation is transitive: Proof: Infinitesimals. Riemann Sum. Riemann Integral: and , as long as , then . Then is the limit of the Riemann sum<br>This is due to the definition of the Riemann integral and the linear properties of limits. <a data-href="Limits#Rules with Real Number Operations" href="math/mathematical-analysis/limits.html#Rules_with_Real_Number_Operations_0" class="internal-link" target="_self" rel="noopener nofollow">Limits &gt; Rules with Real Number Operations</a>Riemann Integral. Extending exponential and logarithmic functions to real-number arguments: because this variable-upper-limit integral has the following properties, similar to those of rational-exponent logarithms: then define via the change-of-base formula, and define as the inverse function of ; from the conclusions above it easily follows that the exponential function also obeys the laws of powers: hence: The natural base: define the base such that , i.e. ; by the definition of the derivative: and the existence of the important limit, therefore: is called the base of the natural logarithm. Exponential and Logarithmic Functions. Taylor Series: approximating by using its behavior at . Fundamental Theorem of Calculus<br>Integrate by parts <a data-href="Integration skills#integration by parts" href="math/mathematical-analysis/integrals/integration-skills.html#integration by parts" class="internal-link" target="_self" rel="noopener nofollow">Integration skills &gt; integration by parts</a>Integrate by parts: Suspect that: Inductive Reasoning<br>The Maclaurin Series is simply the <a data-href="#Tayler Series" href="math/mathematical-analysis/mathematical-analysis.html#Tayler_Series_1" class="internal-link" target="_self" rel="noopener nofollow">Taylor Series</a> centered at :<br>By the <a data-tooltip-position="top" aria-label="Continuous Functions > MVT for Integrals" data-href="Continuous Functions#MVT for Integrals" href="math/mathematical-analysis/continuous-functions.html#MVT_for_Integrals_0" class="internal-link" target="_self" rel="noopener nofollow">MVT for Integrals</a>, we can derive the Lagrange Remainder from the Integral Remainder: knowledge: So we know well about around , and thus can compute around . from
math import factorial

DERIVATIVES = [0, 1, 0, -1]  # derivatives of sin at 0 cycle as: sin, cos, -sin, -cos
pi = 3.141592653589793

def sin(x: float) -&gt; float:
    if x &lt; 0:
        return -sin(-x)
    if x &gt; 2 * pi:
        return sin(x % (2 * pi))
    res = 0.0
    for n in range(100):
        res += DERIVATIVES[n % 4] * (x ** n) / factorial(n)
    return res
Taylor Series; Dedekind Definition; Monotone Convergence Theorem]]></description><link>math/mathematical-analysis/mathematical-analysis.html</link><guid isPermaLink="false">math/Mathematical Analysis/Mathematical Analysis.canvas</guid><pubDate>Wed, 25 Mar 2026 22:57:18 GMT</pubDate></item><item><title><![CDATA[Discriminative Learning]]></title><description><![CDATA[Linear Classifiers
Learn CCDs from data, then apply <a data-tooltip-position="top" aria-label="Bayesian Decision Theory" data-href="Bayesian Decision Theory" href="masters/semb/cs5487-ml/decisioning/bayesian-decision-theory.html" class="internal-link" target="_self" rel="noopener nofollow">BDT</a> to get the decision rule. Data is used in step 1; the decision rule is secondary.
Density estimation is an ill-posed (difficult) problem: which density should we use? Vapnik's advice:
when solving a given problem, try to avoid solving a more difficult problem as an intermediate step.
Solve for decision rule directly
input: ; output: . Find a linear function that separates the input space into 2 half-spaces; points into the positive space; is the decision boundary. Decision Rule: the bias term can be included in . Training Set: Given a : correctly classified; misclassified. Ideal case: 0-1 loss; it fails because it has only zero or undefined gradients. FLD is a version of LSC. Rosenblatt 1962. Criterion: only look at the misclassified points. M = set of misclassified points; = loss function, with larger loss for misclassified points far from the boundary: Perceptron Algorithm: look at one sample at a time and minimize (gradient descent); it's now called stochastic gradient descent, SGD. How to set the learning rate : rotate towards the misclassified point
the length of increases at each iteration, so each update has less effect than the previous one
<br><img alt="IMG_1495.jpg" src="masters/semb/cs5487-ml/decisioning/attachments/img_1495.jpg" target="_self">Rosenblatt proved SGD converges in iterations if the data is linearly separable
: for , : the optimal unit weight vector that perfectly separates the two classes with the maximum possible margin.
many possible solutions, depending on initialization
does not converge if data is not linearly separable.
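The perceptron update above can be sketched as follows, on illustrative linearly separable data (labels in {+1, -1}; the bias is folded in as a constant feature; names and values are hypothetical):

```python
# Perceptron via stochastic gradient descent: a minimal sketch.
# A point is misclassified when y * (w . x) <= 0; the update rotates w toward y * x.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def perceptron(samples, labels, lr=1.0, epochs=100):
    w = [0.0] * len(samples[0])          # bias folded in: last feature of each x is 1
    for _ in range(epochs):
        updated = False
        for x, y in zip(samples, labels):
            if y * dot(w, x) <= 0:       # misclassified: rotate w toward the point
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                updated = True
        if not updated:                  # all points correctly classified: converged
            break
    return w

# Illustrative separable toy data (last component is the constant-1 bias feature).
X = [(2.0, 1.0, 1.0), (1.0, 2.0, 1.0), (-1.0, -1.0, 1.0), (-2.0, -1.0, 1.0)]
y = [1, 1, -1, -1]
w = perceptron(X, y)
assert all(yi * dot(w, xi) > 0 for xi, yi in zip(X, y))
```

On separable data the loop exits early once an epoch passes with no update; on non-separable data it simply stops after `epochs` passes, mirroring the non-convergence noted above.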
(Probabilistic Approach) Binary class: PS6-7: when the CCDs are Gaussian, , the posterior is , which is a sigmoid function where f(x) is linear. Sigmoid: with the BDR, was determined by the CCDs.
Now we directly learn the linear function: ; probability: . Decision Rule: Look at the number of parameters
BDR-Gauss: Logistic Regression: , less likely to overfit.
Learning: parameter estimation. Let . Bernoulli likelihood: Data log-likelihood: MLE: Find the zero-crossings of the gradient:
Newton-Raphson Method. Weighted least squares: the weights are , the target is . R: the weights depend on , weighting unconfident predictions higher
z: the error between prediction and target, which depends on . IRLS, IRWLS: iteratively reweighted least squares. All have the form: "empirical risk minimization" - reduce the training error. Let . Ideal 0-1 loss. LSC: penalizes "too correct" answers. Perceptron:<br><a data-href="#Logistic Regression" href="masters/semb/cs5487-ml/decisioning/discriminative-learning.html#Logistic_Regression_0" class="internal-link" target="_self" rel="noopener nofollow">Logistic Regression</a>: some loss for correctly classified points; the effect is to push the boundary away from the nearby points.<br>
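The logistic-regression MLE above can be sketched on illustrative data; plain gradient ascent on the Bernoulli log-likelihood stands in here for the faster Newton/IRLS updates (all names and values are hypothetical):

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def fit_logistic(X, y, lr=0.5, steps=2000):
    """Maximize the Bernoulli log-likelihood by gradient ascent.
    The gradient per sample is (t - sigma(w.x)) * x, i.e. error times input."""
    w = [0.0] * len(X[0])
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, t in zip(X, y):           # targets t in {0, 1}
            err = t - sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            grad = [g + err * xi for g, xi in zip(grad, x)]
        w = [wi + lr * g for wi, g in zip(w, grad)]
    return w

# Illustrative 1-D data with a bias feature appended.
X = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0), (3.0, 1.0)]
y = [0, 0, 1, 1]
w = fit_logistic(X, y)
probs = [sigmoid(sum(wi * xi for wi, xi in zip(w, x))) for x in X]
assert probs[0] < 0.5 and probs[3] > 0.5
```

IRLS would replace the fixed-step update with a Newton step using the weighted-least-squares form noted above; the fixed-step version keeps the sketch short.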
<img alt="IMG_1498.jpg" src="masters/semb/cs5487-ml/decisioning/attachments/img_1498.jpg" target="_self">Loss for LSC &amp; LR are convex approx to 0-1 loss]]></description><link>masters/semb/cs5487-ml/decisioning/discriminative-learning.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Decisioning/Discriminative Learning.md</guid><pubDate>Thu, 19 Mar 2026 15:44:46 GMT</pubDate><enclosure url="masters/semb/cs5487-ml/decisioning/attachments/img_1495.jpg" length="0" type="image/jpeg"/><content:encoded>&lt;figure&gt;&lt;img src="masters/semb/cs5487-ml/decisioning/attachments/img_1495.jpg"&gt;&lt;/figure&gt;</content:encoded></item><item><title><![CDATA[IMG_1498]]></title><description><![CDATA[<img src="masters/semb/cs5487-ml/decisioning/attachments/img_1498.jpg" target="_self">]]></description><link>masters/semb/cs5487-ml/decisioning/attachments/img_1498.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Decisioning/Attachments/IMG_1498.jpg</guid><pubDate>Thu, 19 Mar 2026 13:51:49 GMT</pubDate><enclosure url="masters/semb/cs5487-ml/decisioning/attachments/img_1498.jpg" length="0" type="image/jpeg"/><content:encoded>&lt;figure&gt;&lt;img src="masters/semb/cs5487-ml/decisioning/attachments/img_1498.jpg"&gt;&lt;/figure&gt;</content:encoded></item><item><title><![CDATA[IMG_1495]]></title><description><![CDATA[<img src="masters/semb/cs5487-ml/decisioning/attachments/img_1495.jpg" target="_self">]]></description><link>masters/semb/cs5487-ml/decisioning/attachments/img_1495.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Decisioning/Attachments/IMG_1495.jpg</guid><pubDate>Thu, 19 Mar 2026 12:52:44 GMT</pubDate><enclosure url="masters/semb/cs5487-ml/decisioning/attachments/img_1495.jpg" length="0" type="image/jpeg"/><content:encoded>&lt;figure&gt;&lt;img src="masters/semb/cs5487-ml/decisioning/attachments/img_1495.jpg"&gt;&lt;/figure&gt;</content:encoded></item><item><title><![CDATA[Fisher's Linear 
Discriminant]]></title><description><![CDATA[class mean: class scatter: why not an inner product? Goal: find the optimal projection that maximizes the between-class distance while minimizing the within-class distance: within-class scatter:]]></description><link>masters/semb/cs5487-ml/decisioning/fisher's-linear-discriminant.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Decisioning/Fisher's Linear Discriminant.md</guid><pubDate>Thu, 19 Mar 2026 11:58:50 GMT</pubDate></item><item><title><![CDATA[PCA]]></title><link>masters/semb/cs5487-ml/decisioning/pca.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Decisioning/PCA.md</guid><pubDate>Thu, 19 Mar 2026 11:44:53 GMT</pubDate></item><item><title><![CDATA[Bayesian Decision Theory]]></title><description><![CDATA[optimal decisions in problems with uncertainty
states: ; prior: = probability of a state. An observer measures features from the r.v. . Class conditional distribution -- conditional on the state (class); one CCD for each state. Decision function: use the observation to make a decision about the state. Loss function - penalizes deciding (the wrong state). 0-1 Loss: Risk - the expected value of the loss function. Since , , minimizing the risk can be achieved by minimizing the conditional risk for each . For a particular , choose the class that minimizes the risk: This is the Bayesian Decision Rule. Settings: conditional risk: BDR: Equivalently: Example - 2-class classification: given , pick if , equivalently: Summary: for the 0-1 loss function
BDR is the MAP rule (tells threshold for Likelihood Ratio Test)
Risk = prob of error
BDR minimizes prob of error (no other decision rule is better)
caveats: assuming our models are correct! (the CCD and the prior)
This is called a generative model
Use data to learn the CCDs (modeling how features are generated)
use the CCD in decision rule
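The two steps above (learn one CCD per class from data, then plug the CCDs into the decision rule) can be sketched for a hypothetical 1-D two-class problem with Gaussian CCDs; all data and names are illustrative:

```python
import math

def gauss_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# Step 1: learn one CCD per class from data (fit mean and variance).
def fit_ccd(samples):
    mu = sum(samples) / len(samples)
    var = sum((s - mu) ** 2 for s in samples) / len(samples)
    return mu, var

# Step 2: plug the CCDs into the BDR; with 0-1 loss this is the MAP rule.
def bdr(x, ccds, priors):
    posts = {c: priors[c] * gauss_pdf(x, *ccds[c]) for c in ccds}
    return max(posts, key=posts.get)

data = {0: [-1.2, -0.8, -1.0, -0.9], 1: [0.9, 1.1, 1.3, 0.8]}
ccds = {c: fit_ccd(v) for c, v in data.items()}
priors = {0: 0.5, 1: 0.5}
assert bdr(-1.0, ccds, priors) == 0 and bdr(1.0, ccds, priors) == 1
```

With equal priors the rule reduces to picking the class with the larger likelihood; unequal priors shift the decision threshold, as in the likelihood-ratio view above.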
decode : Goal: given , recover the bit Model:
prior: CCD: assume Gaussian additive noise: BDR for 0-1 Loss: Hence: pick when:
CCDs are Gaussians
No assumption on prior
Special case:
Assume (shared isotropic covariance) (discriminant function)
defines a hyperplane passing through that is normal to . Goal: find the optimal for the given assumptions (prior, CCD, loss function). Mode, Mean, Median. Loss function for every predicted-true value pair. Conditional Risk: given x, the risk of the system is: Total Risk: choose the action that gives the smallest possible value for likelihood, loss, prior. To minimize the total risk, minimize the conditional risk. LRT: L(g(x), y) p(y|x). Volume of a sphere = <a data-href="Gamma Function#Gamma Function" href="math/mathematical-analysis/applications/gamma-function.html#Gamma Function" class="internal-link" target="_self" rel="noopener nofollow">Gamma Function &gt; Gamma Function</a>. Volume of a hypercube: corner vector , and axis vector . Hypersphere shell of thickness : outside sphere , inside sphere ; all the volume is inside the shell<br><a data-href="Central Limit Theorem" href="math/statistics/central-limit-theorem.html" class="internal-link" target="_self" rel="noopener nofollow">Central Limit Theorem</a>: the sum / average is concentrated around the mean; as increases, converges to . Thus the length of almost all sample vectors will be . Thus in high dimensions a Gaussian is a spherical shell, and most of the density is in this shell.
The point of max density is still the mean (0). In theory, adding new features will not increase the error. Informative features:
uninformative features: the CCDs still overlap. Quality of the CCD estimates:
density estimates in high dimensions need more data! e.g. a high-dimensional histogram: suppose we want, on average, 1 sample / bin
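The growth can be sketched with a back-of-the-envelope count, assuming (illustratively) m bins per axis and 1 sample per bin on average:

```python
# Samples needed to fill a d-dimensional histogram with m bins per axis,
# under the illustrative assumption of 1 sample per bin on average.
def samples_needed(m, d):
    return m ** d  # the bin count, and hence the data requirement, is m^d

assert samples_needed(10, 1) == 10
assert samples_needed(10, 3) == 1000
assert samples_needed(10, 6) == 1000000  # grows exponentially with the dimension
```

This is one face of the curse of dimensionality: each added feature multiplies the number of cells that the training data must cover.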
In general, the desired training set size is , where is the number of parameters. CCD
CCD = Class Conditional Density
P(x | C_i) reduce number of parameters (complexity of model)
reduce number of features (dimensionality reduction) reduce # parameters
create more data Bayesian formulation (virtual samples)
data augmentation; summarize correlated features with fewer features
How?
The data lives in a low-dimensional subspace (a linear operation). Idea: if the data lives in a subspace, it will look flat in some directions. If we fit a Gaussian, it will be highly skewed. Eigen-decomposition: each defines an axis of the ellipse
each defines the width along axis.
keep the large eigenvalues: select the with the largest eigenvalues (principal components) to find the subspace where the data "lives". Recipe of PCA:
calculate the Gaussian: ; eigen-decompose: ; sort the eigenvalues from largest to smallest
select top-k eigenvectors: project onto : as new feature vector, BDR as usual
Notes:
This selection of maximizes the variance of the projected training data and minimizes the reconstruction error of the training data
can be implemented efficiently using SVD
pick a that works
pick to preserve of the variance of the data. Assumption: the "noise" variance is smaller than the signal variance. PCA is optimal for representation (but not necessarily for classification); no way to fit it to the classes! (we don't use class information)
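The PCA recipe above can be sketched for k = 1, with power iteration on the covariance matrix standing in for a full eigendecomposition or SVD (data and tolerances are illustrative):

```python
import math

def pca_top_component(X, iters=200):
    """Top principal component of row-wise data X via power iteration
    on the sample covariance matrix (k = 1 case of the recipe)."""
    n, d = len(X), len(X[0])
    mu = [sum(row[j] for row in X) / n for j in range(d)]
    Xc = [[row[j] - mu[j] for j in range(d)] for row in X]    # center the data
    # sample covariance C = (1/n) Xc^T Xc
    C = [[sum(r[i] * r[j] for r in Xc) / n for j in range(d)] for i in range(d)]
    v = [1.0] * d
    for _ in range(iters):                # repeated C v converges to the top eigenvector
        w = [sum(C[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Data that "lives" along the direction (1, 1): the component should align with it.
X = [[1.0, 1.1], [2.0, 1.9], [3.0, 3.05], [4.0, 3.95], [-1.0, -1.0]]
v = pca_top_component(X)
assert abs(abs(v[0]) - abs(v[1])) < 0.1   # roughly the 45-degree direction
```

For the full recipe one would keep the top-k eigenvectors and project each centered sample onto them to get the new feature vector; in practice this is done with SVD rather than power iteration.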
Find a linear projection that best separates the classes
input space (x): class mean; 1-d space (z). Idea: maximize the distance between the projected means. Problem: is unconstrained, so it needs normalization. Fisher's Idea: high cohesion within classes, low coupling between classes]]></description><link>masters/semb/cs5487-ml/decisioning/bayesian-decision-theory.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Decisioning/Bayesian Decision Theory.md</guid><pubDate>Sun, 15 Mar 2026 23:50:35 GMT</pubDate></item><item><title><![CDATA[PS-6]]></title><link>masters/semb/cs5487-ml/attachments/ps-6.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Attachments/PS-6.pdf</guid><pubDate>Wed, 11 Mar 2026 19:01:34 GMT</pubDate></item><item><title><![CDATA[Determinant]]></title><description><![CDATA[What is a system of equations? An equation is an equality; the standard form of a linear equation puts the constant term on the right-hand side and the unknown terms on the left. A system of equations is a collection of equations. What is a solution of a system? An assignment of values to the unknowns that simultaneously satisfies all equations in the system. What does "no solution" mean? It means no assignment of the unknowns can satisfy all equations in the system at once. The goal of Gaussian elimination is to transform the system into an equivalent one (with the same solution set) in which each row and each column has at most one nonzero coefficient. There are several elementary operations:
multiply both sides of an equation by a nonzero constant
multiply an equation by a constant and add it to another equation
swap the positions of two equations
On the coefficient matrix, these operations are the "elementary row operations". The idea of Gaussian elimination is to split solving the system into two steps: forward elimination and back substitution. During forward elimination, pivots are chosen from top to bottom and left to right; each time a pivot is chosen, all entries below it are eliminated, down to the last row. Elimination can produce zero rows or zero columns. How zero columns and zero rows relate to the solutions:
a zero column means the unknown represented by that column can take any value, so the system has infinitely many solutions for any right-hand side
a zero row means that if the right-hand side on that row is nonzero, the equations conflict; so the system has no solution for some right-hand sides, i.e. a solution cannot be guaranteed
Relations between the numbers of rows and columns:
more columns than rows: either there is a zero column, giving infinitely many solutions, or the solution contains free unknowns, which also gives infinitely many solutions
more rows than columns: there must be a zero row, so a solution cannot be guaranteed
equal numbers of rows and columns: it is possible to have neither zero rows nor zero columns, and zero rows and zero columns must appear together; so in that case the system has a unique solution for any right-hand side.
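The forward-elimination / back-substitution procedure above can be sketched as a minimal implementation, assuming a square system with a unique solution (nonzero determinant); values are illustrative:

```python
def solve(A, b):
    """Gaussian elimination: forward elimination with partial pivoting,
    then back substitution. Assumes a square system with a unique solution."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]      # augmented matrix [A | b]
    for col in range(n):                              # forward elimination
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]               # row swap (elementary operation)
        for r in range(col + 1, n):                   # eliminate entries below the pivot
            f = M[r][col] / M[col][col]
            M[r] = [a - f * p for a, p in zip(M[r], M[col])]
    x = [0.0] * n
    for i in reversed(range(n)):                      # back substitution, bottom-up
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

# 2x + y = 5, x + 3y = 10  ->  x = 1, y = 3
x = solve([[2.0, 1.0], [1.0, 3.0]], [5.0, 10.0])
assert all(abs(a - b) < 1e-9 for a, b in zip(x, [1.0, 3.0]))
```

Each inner step uses exactly the elementary operations listed above; the partial-pivoting row swap just picks the largest available pivot for numerical stability.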
Eliminating y from the three-variable system and solving for x gives: to guarantee a solution, the coefficient of must be nonzero. Swapping rows with rows or columns with columns at will only changes the sign of this algebraic expression, so as long as this number is nonzero, the system has a unique solution for any right-hand side. Hence the third-order determinant is: For an square matrix, observing the second- and third-order determinants, we find that at , every term of the determinant is a product of elements taken from the rows, one from each of distinct columns; all combinations of column indices are exhausted without repetition; and the sign of each term is related to the inversion number of the column-index permutation <a data-href="Permutation#Transposition" href="math/combinatories/permutation.html#Transposition" class="internal-link" target="_self" rel="noopener nofollow">Permutation &gt; Transposition</a>. We therefore conjecture that for every square matrix, the condition for solvability for any right-hand side is: exhaust the products of elements taken from distinct rows and columns, sum them with as coefficients, and require the sum to be nonzero. Define the determinant as the Leibniz formula for determinants. A determinant so defined has several properties: an elementary row operation can be expressed as left-multiplication by ,
a row swap adds one transposition, hence ;
multiplying a row by multiplies one factor in every term by , hence
adding a multiple of one row to another splits the determinant into the original determinant plus a determinant with a repeated row; the terms of the latter cancel pairwise, so that determinant is 0, hence
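The Leibniz definition and the sign rule above can be checked on small matrices; this is an illustrative, exponential-cost sketch, not how determinants are computed in practice:

```python
from itertools import permutations

def sign(perm):
    """Sign from the inversion count of the permutation (one-line notation)."""
    inv = sum(1 for i in range(len(perm))
                for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inv % 2 else 1

def det(A):
    """Leibniz formula: sum over all permutations p of sign(p) * prod_i A[i][p(i)]."""
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        term = sign(p)
        for i in range(n):
            term *= A[i][p[i]]
        total += term
    return total

A = [[1, 2], [3, 4]]
B = [[2, 0], [1, 1]]
assert det(A) == -2
# multiplicativity: det(AB) = det(A) * det(B)
AB = [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
assert det(AB) == det(A) * det(B)
```

Swapping two rows of `A` flips the result's sign, matching the row-swap property above.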
Using row operations to reduce to , the process can be written as: hence can also be expressed as a product of elementary row-operation matrices: Suppose ; therefore the determinant of a product is the product of the determinants. Proof: The difficulty here is the sign, i.e. why it is . Note that the sign is determined by the inversion number; in the minor the permutation is , while under the original ordering: for the minor, the computation uses the later rows and columns shifted forward by one position. Therefore the swapped index must shift by one, and the index it is swapped with must also shift by one, in order to recover the ordering of the original expression from the ordering of the minor. First map to , then apply the extended permutation of the minor; therefore: hence and are related as follows: Cramer's Rule says that if the determinant is nonzero, then the matrix is invertible. Proof: Therefore implies that has a unique solution for any ; but if , then for particular the system may have no solution. Note
Once we have Cramer's Rule, the logic for whether a system is solvable shifts from "forward elimination in Gaussian elimination" to "Cramer's Rule". Shear invariance: multiplying one row by a number and adding it to another row leaves the determinant unchanged
Collinear collapse: when collinear vectors are present, the determinant collapses to 0
Linear scaling: scaling any one vector by a factor of scales the determinant by as well
Orientation: swapping any two changes the sign of the determinant. The cross product of vectors in dimensions is defined in determinant form: it agrees with the 2- and 3-dimensional cases; beyond 3 dimensions, the magnitude keeps the properties above and the direction is still a normal vector. The determinant of vectors in dimensions amounts to multiplying the cross product of of the vectors with the remaining one (the definition of the determinant), which corresponds geometrically to the notion of volume.]]></description><link>math/linear-algebra/determinant.html</link><guid isPermaLink="false">math/Linear Algebra/Determinant.md</guid><pubDate>Thu, 05 Mar 2026 19:28:41 GMT</pubDate></item><item><title><![CDATA[PS-5]]></title><link>masters/semb/cs5487-ml/attachments/ps-5.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Attachments/PS-5.pdf</guid><pubDate>Wed, 04 Mar 2026 19:12:43 GMT</pubDate></item><item><title><![CDATA[PS-4]]></title><link>masters/semb/cs5487-ml/attachments/ps-4.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Attachments/PS-4.pdf</guid><pubDate>Wed, 04 Mar 2026 19:12:06 GMT</pubDate></item><item><title><![CDATA[PS-3]]></title><link>masters/semb/cs5487-ml/attachments/ps-3.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Attachments/PS-3.pdf</guid><pubDate>Wed, 04 Mar 2026 19:09:08 GMT</pubDate></item><item><title><![CDATA[PS-2]]></title><link>masters/semb/cs5487-ml/attachments/ps-2.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Attachments/PS-2.pdf</guid><pubDate>Wed, 04 Mar 2026 14:54:05 GMT</pubDate></item><item><title><![CDATA[Problem Set 1]]></title><description><![CDATA[LOTUS (Law of the Unconscious Statistician) Theorem: It's all about sign cancelling. The probability density function is defined as: given , to transform to a probability about , slice . If is monotonically decreasing on : If is monotonically increasing on : take the derivative of and apply the chain rule: and both cases generate the same expression: So for piecewise monotonic : this leads to LOTUS: the potential negative sign of also cancels with the potential flipping of the integral limits caused by substituting variables in the definite integral: Proof of the definition of applying LOTUS: Thus: Proof of the definition of applying LOTUS; the direct consequence of the definition of is: so: thus: Let's say , and . Proof of : Definition of expectation on a
vector/matrix of random variables: in the case of a vector or matrix, addition is element-wise, so it's easy to see that: When it comes to vector multiplication, assume is a vector and is a random variable; then the -th element of the vector is: so . When it comes to matrix multiplication, applying linearity of expectation in scalar space: thus: The definition of is: This matches the outer product rule: applying the rule : applying the rule with the results above, we have: multivariate version of LOTUS (continuous case): independence implies the joint probability density is the product of the marginal probability densities: By applying both LOTUS: so for , it will be zero if and are independent: Relationship: uncorrelated does not imply independence; independence implies uncorrelated. and are uncorrelated: marginal distribution: it is apparent that , but , at , so and are not independent. Definition of the conditional expectation: by the law of total expectation: is proven above, so and are uncorrelated. Also called Iterated Expectation: taking the expectation over the conditioned variables and then over the conditions yields the expectation over the joint distribution. For any : when and are independent, we have , so: Quadratic Form. Multivariate Gaussian: a probability density over real vectors (a joint probability). With a diagonal covariance matrix: The Mahalanobis distance would be: The determinant of would be: So the density becomes the product of univariate Gaussians, indicating that a diagonal covariance matrix ensures the variables are independent: with , the Gaussian is squeezed in the y dimension.
<img alt="Pasted image 20260304060504.png" src="masters/semb/cs5487-ml/attachments/pasted-image-20260304060504.png" target="_self">with , all components are i.i.d.<br>
<img alt="Pasted image 20260304060513.png" src="masters/semb/cs5487-ml/attachments/pasted-image-20260304060513.png" target="_self">eigen-decompose. Let . The Mahalanobis distance could be: : centering; : rotate and scale , so they are independent and normalized. Eigenvalue and Eigenvector: ,
Since is a solution for , to get additional values, its determinant should be zero. Apply this idea to : Eigenvectors define the directions; a small eigenvalue concentrates the distribution, while a large eigenvalue spreads it out (more covariance).<br><img alt="Pasted image 20260304064307.png" src="masters/semb/cs5487-ml/attachments/pasted-image-20260304064307.png" target="_self">transform: A Gaussian distribution can be identified: Now process the rest of the expression. The remaining part of the exponent of is times: This could also be treated as a fixed-value Gaussian: And the remaining factor is: which matches the form of a Gaussian together with the exponential part. So the overall distribution expression is: The resulting scaled Gaussian means this could not hold unless the scaling factor is 1. Expand the exponent term, and then complete the square: so the Gaussian part of the product is: leaving the coefficient as: to figure out , apply the Woodbury matrix identity to simplify the nested inverse. Simply applying the Woodbury identity introduces two terms, so we make a further transformation: so the last term of e should be: Combining the two terms in gives: also: observing the second term, we apply the Exchange Identity here: and since is symmetric, and the coefficient is:<br>This transformation procedure uses many rules of <a data-href="Determinant" href="math/linear-algebra/determinant.html" class="internal-link" target="_self" rel="noopener nofollow">Determinant</a>. The coefficient and the redundant exponential term also form a fixed-value Gaussian: So the product of multivariate Gaussians is: Inverting after a low-rank modification at low cost. Motivation: if you've inverted a large matrix , inverting it after adding a low-rank update should not have to start from scratch
: the core update; : brings the update into the space of . How to construct the inverse:
subtract the correction: adding something to the original matrix usually "shrinks" the inverse. To cancel the last two terms, we can build the correction with a sandwich structure to match the terms of :
there is a in the back
we don't want to invert , even if it's invertible, so there should be a in the form
let's call the remaining part . So the correction term should have the form: so we have the last term as: The only unknown matrix here is ; compared with , the only difference is that the last term has a inside, making this , so we are able to cancel these two terms:
So the correction term should be ; the inverse should be like: Notice that we assumed is invertible, which does not always hold. But surprisingly this form also holds when is not invertible. It can be verified by the same process as above. Warning
Do not confuse the concepts of Correlation and Covariance
correlation between distribution is defined as:Using problem 1.8, The correlation between two Gaussian distributions is:trace is defined as the sum of the diagonaleigenvalues are roots of the characteristic function:<br>the latter equivalence is guarantee by <a data-href="Polynomials#Identity Theorem for Polynomials" href="math/polynomials.html" class="internal-link" target="_self" rel="noopener nofollow">Polynomials &gt; Identity Theorem for Polynomials</a>let , then:]]></description><link>masters/semb/cs5487-ml/problem-set-1.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Problem Set 1.md</guid><pubDate>Wed, 04 Mar 2026 14:19:20 GMT</pubDate><enclosure url="." length="0" type="false"/><content:encoded>&lt;figure&gt;&lt;img src="."&gt;&lt;/figure&gt;</content:encoded></item><item><title><![CDATA[Pasted image 20260304064307]]></title><description><![CDATA[<img src="masters/semb/cs5487-ml/attachments/pasted-image-20260304064307.png" target="_self">]]></description><link>masters/semb/cs5487-ml/attachments/pasted-image-20260304064307.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Attachments/Pasted image 20260304064307.png</guid><pubDate>Tue, 03 Mar 2026 22:43:07 GMT</pubDate><enclosure url="." length="0" type="false"/><content:encoded>&lt;figure&gt;&lt;img src="."&gt;&lt;/figure&gt;</content:encoded></item><item><title><![CDATA[Pasted image 20260304060513]]></title><description><![CDATA[<img src="masters/semb/cs5487-ml/attachments/pasted-image-20260304060513.png" target="_self">]]></description><link>masters/semb/cs5487-ml/attachments/pasted-image-20260304060513.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Attachments/Pasted image 20260304060513.png</guid><pubDate>Tue, 03 Mar 2026 22:05:13 GMT</pubDate><enclosure url="." 
length="0" type="false"/><content:encoded>&lt;figure&gt;&lt;img src="."&gt;&lt;/figure&gt;</content:encoded></item><item><title><![CDATA[Pasted image 20260304060504]]></title><description><![CDATA[<img src="masters/semb/cs5487-ml/attachments/pasted-image-20260304060504.png" target="_self">]]></description><link>masters/semb/cs5487-ml/attachments/pasted-image-20260304060504.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Attachments/Pasted image 20260304060504.png</guid><pubDate>Tue, 03 Mar 2026 22:05:04 GMT</pubDate><enclosure url="." length="0" type="false"/><content:encoded>&lt;figure&gt;&lt;img src="."&gt;&lt;/figure&gt;</content:encoded></item><item><title><![CDATA[PS-1]]></title><link>masters/semb/cs5487-ml/attachments/ps-1.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Attachments/PS-1.pdf</guid><pubDate>Tue, 03 Mar 2026 15:03:10 GMT</pubDate></item><item><title><![CDATA[Kernel Density Estimation]]></title><description><![CDATA[center the kernel to the data points, and take average.The estimated distribution of the data can be used to:
bootstrap an algorithm
help make decisions
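The idea above (center a kernel at each data point and average) can be sketched with Gaussian kernels and an illustrative bandwidth:

```python
import math

def kde(x, data, h=0.5):
    """Kernel density estimate at x: the average of Gaussian kernels
    centered at the data points, with bandwidth h."""
    def k(u):
        return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)
    return sum(k((x - xi) / h) for xi in data) / (len(data) * h)

# Illustrative data with two clusters near -1 and +1.
data = [-1.0, -0.9, 1.0, 1.1]
# The estimated density is higher near a cluster than in the gap between them.
assert kde(1.0, data) > kde(0.0, data)
```

The bandwidth h controls the smoothness of the estimate: small h gives spiky density around each sample, large h blurs the clusters together.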
]]></description><link>masters/semb/cs5487-ml/distribution-estimation/kernel-density-estimation.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Distribution Estimation/Kernel Density Estimation.md</guid><pubDate>Sat, 28 Feb 2026 19:34:11 GMT</pubDate></item><item><title><![CDATA[Maximum A Posteriori]]></title><description><![CDATA[Prior: the initial parameters of the assumed distribution of the parameter. Given a dataset, adjust the distribution of the parameter and find the parameter value that maximizes its posterior probability. It can be written as:]]></description><link>masters/semb/cs5487-ml/distribution-estimation/maximum-a-posteriori.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Distribution Estimation/Maximum A Posteriori.md</guid><pubDate>Sat, 28 Feb 2026 19:16:57 GMT</pubDate></item><item><title><![CDATA[Lagrange Multiplier]]></title><description><![CDATA[If there is a constraint and we need the extrema of subject to it, construct a continuous function with the constraint as an added term. Intuitively, determines the optimal point set (zero penalty) and the penalty strength when falls on other points. The Lagrange multiplier controls the strength of the constraint. To find stationary points, set the gradient to 0; the stationary points solved this way both satisfy the constraint and have the gradient of collinear with the gradient of the constraint at those points.]]></description><link>masters/semb/cs5487-ml/optimization/lagrange-multiplier.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Optimization/Lagrange Multiplier.md</guid><pubDate>Sat, 28 Feb 2026 19:05:14 GMT</pubDate></item><item><title><![CDATA[Maximum Likelihood Estimation]]></title><description><![CDATA[Idea of MLE: the "likelihood" is the probability that the dataset appears, so the parameter that maximizes the likelihood might be the true parameter. Step 1, make assumptions:
assume the distribution is ..., and the PDF/PMF is . Now estimate the parameters by maximizing the likelihood. Likelihood: the product of the probability / probability density of the data / dataset. Given a sample dataset , the likelihood is a function of : Since log is monotone, sometimes we use the log-likelihood: Find the that maximizes the probability that occurs: the optimized converges in probability to the true parameter ; the distribution of the estimation error tends to a normal distribution. The MLE of is . Define the likelihood of as the supremum of with : And the optimized should be the one that makes be the supremum of , which is the supremum of : and the likelihood of is: which means , so can be ]]></description><link>masters/semb/cs5487-ml/distribution-estimation/mle/maximum-likelihood-estimation.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Distribution Estimation/MLE/Maximum Likelihood Estimation.md</guid><pubDate>Sat, 28 Feb 2026 18:58:29 GMT</pubDate></item><item><title><![CDATA[Permutation]]></title><description><![CDATA[A Permutation defined on can be represented as a bijection . Cauchy's two-line notation; one-line notation; a -cycle is represented by , and cycle notation writes a permutation as all of the cycles of the permutation: how to write down the cycle notation:
Pick one element that has not been picked yet, and build its cycle. In the cycle notation, every element belongs to that unique cycle.
1-cycles are omitted.
Canonical Cycle notation
in each cycle the largest element is listed first
the cycles are sorted in increasing order of their first element
inverting the permutation: inverting the cyclesfunction composition of permutation:
associative: ; identity permutation: ; each permutation has an inverse such that . The order of a permutation is the smallest positive number such that . This number is the least common multiple (lcm) of the lengths of its cycles, so every element cycles back to its original position. Every permutation of a finite set can be expressed as a product of transpositions, since every cycle can be composed from transpositions: For example: proof: , and , and and: so any cycle can be decomposed into parities: And: Proof: both and can be decomposed into transpositions, so can also be decomposed into the composition of these transpositions.
odd + odd is even, odd + even is odd, even + even is even. This matches (-1)(-1) = 1, (-1)(1) = (-1), (1)(1) = 1.
This is what was to be proven.
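The cycle decomposition, the order-as-lcm fact, and the parity rule above can be sketched as follows (one-line notation on 0..n-1; the example permutation is illustrative):

```python
from math import gcd
from functools import reduce

def cycles(perm):
    """Cycle decomposition: pick an unvisited element, follow it until it returns."""
    seen, out = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        cyc, cur = [], start
        while cur not in seen:
            seen.add(cur)
            cyc.append(cur)
            cur = perm[cur]
        out.append(tuple(cyc))
    return out

def order(perm):
    """Order = lcm of cycle lengths: the first step count at which every element is home."""
    lengths = [len(c) for c in cycles(perm)]
    return reduce(lambda a, b: a * b // gcd(a, b), lengths, 1)

def parity(perm):
    """A k-cycle is (k-1) transpositions; even total -> +1, odd total -> -1."""
    transpositions = sum(len(c) - 1 for c in cycles(perm))
    return -1 if transpositions % 2 else 1

p = [1, 2, 0, 4, 3]                      # cycles (0 1 2)(3 4)
assert cycles(p) == [(0, 1, 2), (3, 4)]
assert order(p) == 6                     # lcm(3, 2)
assert parity(p) == -1                   # (3-1) + (2-1) = 3 transpositions, odd
```

Note how the parity follows the odd/even addition rule above: the transposition counts of the cycles add, and the sign multiplies accordingly.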
Transposition
a transposition is a single 2-cycle
]]></description><link>math/combinatories/permutation.html</link><guid isPermaLink="false">math/combinatories/Permutation.md</guid><pubDate>Sat, 28 Feb 2026 17:09:45 GMT</pubDate></item><item><title><![CDATA[Expectation-Maximization Algorithm]]></title><description><![CDATA[
E-step: estimating missing data with current parameters
M-step: maximizing likelihood to update parameters
]]></description><link>masters/semb/cs5487-ml/optimization/expectation-maximization-algorithm.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Optimization/Expectation-Maximization Algorithm.md</guid><pubDate>Sun, 22 Feb 2026 13:34:55 GMT</pubDate></item><item><title><![CDATA[Problem Set 3]]></title><description><![CDATA[Integrate by parts:<a data-href="Gamma Function" href="math/mathematical-analysis/applications/gamma-function.html" class="internal-link" target="_self" rel="noopener nofollow">Gamma Function</a>let has an exponential densitythe prior ]]></description><link>masters/semb/cs5487-ml/problem-set-3.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Problem Set 3.md</guid><pubDate>Tue, 31 Mar 2026 13:51:45 GMT</pubDate></item><item><title><![CDATA[Gaussian Integral]]></title><description><![CDATA[Integration of Gaussian Function , a.k.a. Euler-Poisson integralit's a constant , you can get this by Integration by substitution]]></description><link>math/mathematical-analysis/integrals/gaussian-integral.html</link><guid isPermaLink="false">math/Mathematical Analysis/integrals/Gaussian Integral.md</guid><pubDate>Tue, 31 Mar 2026 13:47:34 GMT</pubDate></item><item><title><![CDATA[Gamma Function]]></title><description><![CDATA[Euler found that the integral has the property when he takes the integral by parts:let , and define redefine as , and we have:Tips here is a real number. I use the real number definition of power function here.
Integrate by parts:let , and you would find that it's <a data-href="Gaussian Integral" href="math/mathematical-analysis/integrals/gaussian-integral.html" class="internal-link" target="_self" rel="noopener nofollow">Gaussian Integral</a>]]></description><link>math/mathematical-analysis/applications/gamma-function.html</link><guid isPermaLink="false">math/Mathematical Analysis/applications/Gamma Function.md</guid><pubDate>Tue, 31 Mar 2026 13:34:33 GMT</pubDate></item><item><title><![CDATA[Gaussian Distribution]]></title><description><![CDATA[Gaussian Distribution
Named after Gauss; with some assumptions on the error, it describes the uncertainty of samples: errors sum to zero (the average of the samples is the true value) if the errors are independent
Suppose:
the probability (density) function of error is the true value is the observed values are the errors are Then the <a data-href="Maximum Likelihood Estimation" href="masters/semb/cs5487-ml/distribution-estimation/mle/maximum-likelihood-estimation.html" class="internal-link" target="_self" rel="noopener nofollow">Maximum Likelihood Estimation</a> of the true value gives the first constraint. This process says: to maximize the likelihood of , function should sum up to . Now with the assumption that all errors sum to , we have two constraints: can be one solution, thus: Solve from the differential equation: So the distribution of error is parameterized by and , and the parameterized probability density is: why not ?
IDK, TODO
As the probability density is the derivative of the cumulative distribution function, the integral of on should be :<br> with <a data-href="Gaussian Integral" href="math/mathematical-analysis/integrals/gaussian-integral.html" class="internal-link" target="_self" rel="noopener nofollow">Gaussian Integral</a>:with , then:For any function symmetric w.r.t. , its integral on is :as is symmetric w.r.t. :substituting the observed variable back, then:thus the density function of the distribution is. The expectation of would be:<br>according to <a data-href="Variance#Transformation" href="math/statistics/variance.html#Transformation" class="internal-link" target="_self" rel="noopener nofollow">Variance &gt; Transformation</a>, the variance of would be the same, which is . In conclusion, with the assumption of error, the distribution of a sample would be described as:where: is the expectation and is the variance.]]></description><link>math/statistics/gaussian-distribution.html</link><guid isPermaLink="false">math/Statistics/Gaussian Distribution.md</guid><pubDate>Tue, 31 Mar 2026 13:31:57 GMT</pubDate></item><item><title><![CDATA[Quant]]></title><link>living/trading/quant.html</link><guid isPermaLink="false">living/trading/Quant.md</guid><pubDate>Mon, 30 Mar 2026 19:13:34 GMT</pubDate></item><item><title><![CDATA[2025-12-29 深圳北站]]></title><description><![CDATA[Metro transfer:
Line 6 is on the second level of the elevated station; the elevated structure has three levels, with the exit gates on the first level. The lift is slow because it has to go to the top level to pick up passengers first.
Line 6 runs about every 8 minutes; car 5 is closest to the exit passage.
After exiting the metro, go to ticket gate A18.
It takes about 15 minutes, mostly for the outside-station transfer; you must pass through a crowded escalator, and there is no in-station transfer route.
Situation: carried only one bag; security was relaxed during the off-peak period. Water was not tested, and the volunteer auntie doing the body check just gave a couple of quick sweeps and let me through.
Waiting areas:
Starbucks, KFC, and Luckin are mirrored in both zones A and B (the zone B Starbucks is on the first floor). Luckin is expensive at 28 RMB a cup, about the same as Starbucks.
There is a circular reading area between A9/B9 with many sockets for free charging.
Time planning:
After navigating to Line 6, reserve at least 15 minutes for the outside-station transfer and security check; reserve more time with multiple bags or during passenger peaks.
Ticket checking starts fifteen minutes before the train departs and ends 4 minutes before.
So arrive at the metro station at least 15+15 minutes before departure.
]]></description><link>living/交通/2025-12-29-深圳北站.html</link><guid isPermaLink="false">living/交通/2025-12-29 深圳北站.md</guid><pubDate>Mon, 30 Mar 2026 19:13:08 GMT</pubDate></item><item><title><![CDATA[Desktop]]></title><description><![CDATA[
System-wide ECC (Error-Correcting Code) support: for mission-critical systems
Find a mini-program to simulate a workstation PC build: full ECC]]></description><link>living/electronics/desktop.html</link><guid isPermaLink="false">living/Electronics/Desktop.md</guid><pubDate>Mon, 30 Mar 2026 19:12:20 GMT</pubDate></item><item><title><![CDATA[Network Science]]></title><description><![CDATA[Type of network, classified by randomness:
Regular Network
Random Network
In between?
properties:
degree of the network: high when regular, drops immediately when there are some "short cuts" (from increasing randomness)
measure of clustering of the network: high when regular, drops slowly as randomness increases
Why are there hubs? Starting point: all large networks are generated and built from small to large.
network growing process:
node preference: nodes prefer joining popular nodes.
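The growth process above can be sketched with a toy preferential-attachment simulation (a hypothetical minimal model, not from this note): each new node attaches to an existing node with probability proportional to that node's degree, which is what produces hubs.

```python
import random

def grow_network(n_nodes: int, seed: int = 0) -> dict:
    """Grow a network node by node; attachment is degree-proportional."""
    random.seed(seed)
    degree = {0: 1, 1: 1}  # start from a single edge
    stubs = [0, 1]         # node i appears degree[i] times in this list
    for new in range(2, n_nodes):
        target = random.choice(stubs)  # picking a stub = degree-proportional
        degree[new] = 1
        degree[target] += 1
        stubs += [new, target]
    return degree

deg = grow_network(2000)
print(sorted(deg.values(), reverse=True)[:5])  # a few high-degree hubs
print(sum(deg.values()) / len(deg))            # average degree stays near 2
```

Most nodes keep degree 1 or 2 while a handful of early, popular nodes accumulate large degrees, matching the "nodes prefer joining popular nodes" intuition.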
Single-round game: no matter what option the opponent chooses, you should always choose N (non-cooperative). But for a multi-round game: the opponent will observe you, and if you keep choosing P, the expected utilities of the two options are: as long as your probability of choosing to cooperate is , the opponent would choose to cooperate too, and so would you; trust is built, and both players end up cooperating forever.]]></description><link>science/network-science.html</link><guid isPermaLink="false">science/Network Science.md</guid><pubDate>Mon, 30 Mar 2026 19:03:09 GMT</pubDate></item><item><title><![CDATA[iPhone]]></title><description><![CDATA[Best strategy: you can use an iPhone at 2700 RMB per phone and change for a better one every 2 years! The strategy is to stay two generations behind, swap every two years, and always buy the Pro model with the second storage tier
In practice: buy a like-new (99%) second-hand phone, add a screen protector and a case, replace the battery officially, and take good care of it; after two years sell it as like-new and switch to another second-hand phone two generations behind.
The result: save 40%+ and enjoy an iPhone for 2700 RMB. The naive approach: buy the newest Pro model with second-tier storage from the Apple Store at full price, and ignore damage risk. iPhone 14 Pro purchased at 2023.3.16
buying costs: 8899
fix screen: 280
fix motherboard: 500
fix battery, camera: 500
Phone case, screen protector: 100
assume the life cycle is around 4 years; then the total cost is: cost/day = . Best iPhone purchasing strategy: 2 generations late, second-hand, change every two years.
current strategy: Naive (history): change every 4 years, brand new, latest, second-tier storage: per day: , total: 10279 RMB. Continued (if this plan continues): iPhone 17 Pro 512G: 10999
Accessories costs: phone case + screen protector (4 years): 160. Smart strategy: buy iPhone 15 Pro 256G (second-hand): 3920 RMB
sell iPhone 13 Pro 256G (second-hand): 2100 RMB
difference: 1820 RMB. Accessories costs: (phone case + screen protector) * 2: 160 RMB
potential battery costs (official): 809 RMB. Way cheaper!!!!📱 Naive Strategy
Phone cost: ¥8,899.00
Repair costs: ¥1,280.00
Accessories: ¥100.00
─────────────────────────────
TOTAL: ¥10,279.00
Per day: ¥7.0404
Per year: ¥2,569.75📱 Current Plan (Continued)
Phone cost: ¥10,999.00
Repair costs: ¥969.00
Accessories: ¥160.00
─────────────────────────────
TOTAL: ¥12,128.00
Per day: ¥8.3068
Per year: ¥3,032.00📱 Smart Strategy (2-year view)
Phone cost: ¥1,820.00
Repair costs: ¥809.00
Accessories: ¥160.00
─────────────────────────────
TOTAL: ¥2,789.00
Per day: ¥3.8205
Per year: ¥1,394.50💰 Smart vs Naive:
Daily savings: ¥3.2199
4-year savings: ¥4,701.00 (45.7%)💰 Smart vs Current Plan:
Daily savings: ¥4.4863
4-year savings: ¥6,550.00 (54.0%)

from dataclasses import dataclass

SMARTSTR_NEW_PHONE_PRICE = 3920
SMARTSTR_OLD_PHONE_PRICE = 2100


@dataclass
class Strategy:
    name: str
    phone_cost: float
    repair_costs: float
    accessories_cost: float
    lifecycle_years: float
    description: str

    def total_cost(self) -&gt; float:
        return self.phone_cost + self.repair_costs + self.accessories_cost

    def cost_per_day(self) -&gt; float:
        return self.total_cost() / (self.lifecycle_years * 365)

    def cost_per_year(self) -&gt; float:
        return self.cost_per_day() * 365


def calculate_strategies():
    # Strategy 1: Naive - buy newest Pro, full price, 4 year lifecycle
    naive = Strategy(
        name="Naive Strategy",
        phone_cost=8899,
        repair_costs=280 + 500 + 500,  # screen + motherboard + battery/camera
        accessories_cost=100,
        lifecycle_years=4,
        description="Buy newest Pro model, full price, keep 4 years",
    )
    # Strategy 2: Current plan continuation (extrapolated)
    current_plan = Strategy(
        name="Current Plan (Continued)",
        phone_cost=10999,  # iPhone 17 Pro 512GB
        repair_costs=969,
        accessories_cost=160,
        lifecycle_years=4,
        description="Continue current pattern: latest model, 4 years",
    )
    # Strategy 3: Smart - 2-gen late, second hand, 2-year cycles
    # Two cycles over 4 years: iPhone 13 Pro (2100) → iPhone 15 Pro (3920)
    smart_2yr = Strategy(
        name="Smart Strategy (2-year view)",
        phone_cost=SMARTSTR_NEW_PHONE_PRICE - SMARTSTR_OLD_PHONE_PRICE,
        repair_costs=809,  # one battery replacement
        accessories_cost=160,  # 2 sets of accessories
        lifecycle_years=2,
        description="2-gen late, second hand, swap every 2 years",
    )
    return naive, current_plan, smart_2yr


def print_comparison():
    naive, current, smart = calculate_strategies()
    strategies = [naive, current, smart]
    print("=" * 80)
    print("iPhone Purchasing Strategy Comparison")
    print("=" * 80)
    for s in strategies:
        print(f"\n📱 {s.name}")
        print(f" Phone cost: ¥{s.phone_cost:,.2f}")
        print(f" Repair costs: ¥{s.repair_costs:,.2f}")
        print(f" Accessories: ¥{s.accessories_cost:,.2f}")
        print(f" ─────────────────────────────")
        print(f" TOTAL: ¥{s.total_cost():,.2f}")
        print(f" Per day: ¥{s.cost_per_day():.4f}")
        print(f" Per year: ¥{s.cost_per_year():,.2f}")
    # Savings analysis
    print("\n" + "=" * 80)
    print("SAVINGS ANALYSIS")
    print("=" * 80)
    # vs Naive
    save_naive = (naive.cost_per_year() - smart.cost_per_year()) * naive.lifecycle_years
    print(f"\n💰 Smart vs Naive:")
    print(f" Daily savings: ¥{naive.cost_per_day() - smart.cost_per_day():.4f}")
    print(f" 4-year savings: ¥{save_naive:,.2f} ({save_naive/naive.total_cost()*100:.1f}%)")
    # vs Current
    save_current = (current.cost_per_year() - smart.cost_per_year()) * current.lifecycle_years
    print(f"\n💰 Smart vs Current Plan:")
    print(f" Daily savings: ¥{current.cost_per_day() - smart.cost_per_day():.4f}")
    print(f" 4-year savings: ¥{save_current:,.2f} ({save_current/current.total_cost()*100:.1f}%)")
    return strategies


strategies = print_comparison()
]]></description><link>living/electronics/iphone.html</link><guid isPermaLink="false">living/Electronics/iPhone.md</guid><pubDate>Mon, 30 Mar 2026 03:27:07 GMT</pubDate></item><item><title><![CDATA[Value Estimation]]></title><description><![CDATA[Price/Earning-to-Growth Ratio
P/E: Price/Earning Ratio
G: Annual EPS Growth Ratio, EPS = Earnings per share
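As a worked example of the ratio above (hypothetical numbers, not from this note), PEG = (P/E) / G:

```python
def peg_ratio(price: float, eps: float, growth_pct: float) -> float:
    """PEG = (P/E) / G, where G is annual EPS growth in percent."""
    pe = price / eps
    return pe / growth_pct

# price 100 with EPS 5 gives P/E = 20; 20% annual EPS growth gives PEG = 1.0
print(peg_ratio(100, 5, 20))  # → 1.0
```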
]]></description><link>living/trading/stock/value-estimation.html</link><guid isPermaLink="false">living/trading/Stock/Value Estimation.md</guid><pubDate>Sat, 28 Mar 2026 23:51:32 GMT</pubDate></item><item><title><![CDATA[Fund]]></title><link>living/trading/fund.html</link><guid isPermaLink="false">living/trading/Fund.md</guid><pubDate>Sat, 28 Mar 2026 23:14:29 GMT</pubDate></item><item><title><![CDATA[Central Limit Theorem]]></title><description><![CDATA[<a data-href="Gaussian Distribution" href="math/statistics/gaussian-distribution.html" class="internal-link" target="_self" rel="noopener nofollow">Gaussian Distribution</a><br>while <a data-href="Law of Large Number" href="math/statistics/law-of-large-number.html" class="internal-link" target="_self" rel="noopener nofollow">Law of Large Number</a> says the sample mean almost surely converges to the population's mean, the CLT says the distribution of the -th sample mean is a Gaussian distribution controlled by the population's expectation and variance <br>according to the sum rules of <a data-href="Expectation#Sum Rule" href="math/statistics/expectation.html#Sum Rule" class="internal-link" target="_self" rel="noopener nofollow">Expectation &gt; Sum Rule</a> and <a data-href="Variance#Sum Rule" href="math/statistics/variance.html#Sum Rule" class="internal-link" target="_self" rel="noopener nofollow">Variance &gt; Sum Rule</a>, for i.i.d. , so to standardize the sample mean, subtract and divide by , and this is equivalent to summing up the standardized samples and dividing by :Characteristic FunctionWhy it's called MGF:
Expanding as a Maclaurin series
take the -th derivative
set , and you get the -th moment
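A quick numerical sanity check of the three steps above (my own sketch): take the MGF of the standard Gaussian, M(t) = exp(t²/2), approximate its k-th derivative at t = 0 by a central finite difference, and recover the moments (1, 0, 1, 0, 3, ...).

```python
import math

def mgf_std_gaussian(t: float) -> float:
    # MGF of the standard Gaussian: M(t) = exp(t^2 / 2)
    return math.exp(t * t / 2)

def kth_derivative_at_zero(f, k: int, h: float = 1e-2) -> float:
    # k-th order central finite difference at t = 0
    return sum((-1) ** (k - i) * math.comb(k, i) * f((i - k / 2) * h)
               for i in range(k + 1)) / h ** k

# k-th moment of N(0, 1) = k-th derivative of its MGF at 0
print(round(kth_derivative_at_zero(mgf_std_gaussian, 2), 4))  # → 1.0 (2nd moment)
print(round(kth_derivative_at_zero(mgf_std_gaussian, 4), 2))  # → 3.0 (4th moment)
```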
Given the independence of and , we have the "sum rule" of the MGF<br><a data-href="Expectation#LOTUS" href="math/statistics/expectation.html#LOTUS" class="internal-link" target="_self" rel="noopener nofollow">Expectation &gt; LOTUS</a> is used here. The MGF is also called a characteristic function because the mapping from a function to its MGF is one-to-one. Linearity leads to uniqueness: with linearity, the same transformation implies is one of the roots of the : for the Laplace Transform, since the set of functions forms a complete basis, for any to make hold, it must have that complete basis
the set of function forms a complete basis. You cannot be "perpendicular" to every axis unless you have no "length".
MGFs use exponentials as a basis
Taylor series use polynomials as a basis
The Fourier Inversion: transforming . The Scaling Rule: . Approach:
the MGF of the standardized sample mean converges to the MGF of the standardized Gaussian
the standardized sample mean converges to the standardized Gaussian
the sample mean converges to the Gaussian, parameterized by the population's and MGF of standardized sample means:]]></description><link>math/statistics/central-limit-theorem.html</link><guid isPermaLink="false">math/Statistics/Central Limit Theorem.md</guid><pubDate>Sat, 28 Mar 2026 21:26:54 GMT</pubDate></item><item><title><![CDATA[Law of Large Number]]></title><description><![CDATA[probability of being an outlier has an upper-boundif outliers are identified by , the probability is bounded by :mean of i.i.d. random variablehas the properties:
its expectation is: Its variance is: by Chebyshev's inequality, as , converges to in probability: the probability of being an outlier would be very low. Why it's called "weak": it converges in probability, which is weaker than almost sure convergence, stated by the SLLN: the sample mean definitely converges to the expectation as grows.]]></description><link>math/statistics/law-of-large-number.html</link><guid isPermaLink="false">math/Statistics/Law of Large Number.md</guid><pubDate>Sat, 28 Mar 2026 21:21:08 GMT</pubDate></item><item><title><![CDATA[Untitled]]></title><description><![CDATA[]]></description><link>untitled.html</link><guid isPermaLink="false">Untitled.canvas</guid><pubDate>Sat, 28 Mar 2026 21:07:04 GMT</pubDate></item><item><title><![CDATA[Pasted image 20260329044841]]></title><description><![CDATA[<img src="living/philosophy/attachments/pasted-image-20260329044841.png" target="_self">]]></description><link>living/philosophy/attachments/pasted-image-20260329044841.html</link><guid isPermaLink="false">living/philosophy/Attachments/Pasted image 20260329044841.png</guid><pubDate>Sat, 28 Mar 2026 20:48:41 GMT</pubDate><enclosure url="." length="0" type="false"/><content:encoded>&lt;figure&gt;&lt;img src="."&gt;&lt;/figure&gt;</content:encoded></item><item><title><![CDATA[Pasted image 20260329044058]]></title><description><![CDATA[<img src="living/philosophy/attachments/pasted-image-20260329044058.png" target="_self">]]></description><link>living/philosophy/attachments/pasted-image-20260329044058.html</link><guid isPermaLink="false">living/philosophy/Attachments/Pasted image 20260329044058.png</guid><pubDate>Sat, 28 Mar 2026 20:40:58 GMT</pubDate><enclosure url="." 
length="0" type="false"/><content:encoded>&lt;figure&gt;&lt;img src="."&gt;&lt;/figure&gt;</content:encoded></item><item><title><![CDATA[Pasted image 20260329033153]]></title><description><![CDATA[<img src="living/philosophy/attachments/pasted-image-20260329033153.png" target="_self">]]></description><link>living/philosophy/attachments/pasted-image-20260329033153.html</link><guid isPermaLink="false">living/philosophy/Attachments/Pasted image 20260329033153.png</guid><pubDate>Sat, 28 Mar 2026 19:31:53 GMT</pubDate><enclosure url="." length="0" type="false"/><content:encoded>&lt;figure&gt;&lt;img src="."&gt;&lt;/figure&gt;</content:encoded></item><item><title><![CDATA[Wealth Principles]]></title><description><![CDATA[Real wealth is freedom, investment, savings, not possessions, appearance, instagram posts.with savings,
you have the opportunity to invest, making income for you automatically.
anti-fragility: it keeps you away from crises caused by job loss or disease.
To have savings, you need to:
earn more: don't trade off your health and family! That's just trading one problem for another.
invest in your brain
be professional, be reliable
be the top 1%. Live at low cost: the income doesn't matter, the difference matters
you have more free time
anti-fragility: make proper decisions.
track your expenses, classify them, identify the impulse purchases, and optimize. Make saving a non-negotiable expense.
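The tracking advice above, as a toy sketch (hypothetical data and category names):

```python
from collections import defaultdict

# Classify expenses by category and surface the impulse purchases to optimize.
expenses = [
    ("rent", 3000), ("groceries", 400), ("impulse", 120),
    ("transport", 150), ("impulse", 80), ("saving", 2000),  # saving is non-negotiable
]

by_category = defaultdict(int)
for category, amount in expenses:
    by_category[category] += amount

print(dict(by_category))
print("impulse total:", by_category["impulse"])  # → impulse total: 200
```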
Goal: fight for freedom, financial independence
Battlefield: your spending, your income, your investments.
Your soldiers: every single yuan/dollar.
Spending a dollar is killing one of your soldiers; don't do it for unreasonable reasons!]]></description><link>living/philosophy/wealth-principles.html</link><guid isPermaLink="false">living/philosophy/Wealth Principles.md</guid><pubDate>Sat, 28 Mar 2026 19:24:43 GMT</pubDate></item><item><title><![CDATA[trading rule]]></title><description><![CDATA[
A trader's policies are generated from a transformation of: its asset holdings
available cash
risk appetite
information set Reflexivity: The act of trading itself changes the environment No perfect policy: you always have a chance of making mistakes
Prices are discrete, noisy data.
Price changes are random variables changing over time. The price is the consensus of buyers and sellers: a summary of the behavior of all existing and potential traders. If you knew exactly how each trader behaves, you would know the next tick price.
Predicting every trader is impossible.
Design the architecture with agents?
maybe we can use an LLM to upgrade the definition and define a network for training to utilize existing data? Stop loss: a 33% loss means you need 50% profit next time just to break even!
take profit: the minimum profit you are expecting. Useful for filtering noise.
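The arithmetic behind the stop-loss note: after losing a fraction L of your capital, you need a gain of L / (1 − L) just to break even, so a 33% loss indeed needs a 50% gain.

```python
def required_recovery_gain(loss_fraction: float) -> float:
    # After losing loss_fraction, break-even requires gaining L / (1 - L).
    return loss_fraction / (1 - loss_fraction)

print(f"{required_recovery_gain(1/3):.0%}")  # → 50%
print(f"{required_recovery_gain(0.5):.0%}")  # → 100%
print(f"{required_recovery_gain(0.9):.0%}")  # → 900%
```

Note how the required recovery grows much faster than the loss itself, which is the case for cutting losses early.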
Profitability: a strategy that makes no profit is useless. Sharpe Ratio - standardized Profitability: Sortino Ratio - considers downside deviation only. : Minimum Acceptable Return. DR: standard deviation of only the negative returns (specifically, the returns where ) Risk-Adjustability Sharpe Ratio
Sortino Ratio Drawdown &amp; Recovery Maximum Drawdown (MDD)
Recovery Time Slippage Sensitivity If you know nothing about the price of the asset: the price is bounded, , and the current price . If you invest a percentage of your cash immediately, and you are very unlucky, it drops below your cost price before reaching your take-profit goal price. As time changes, you need to consider two states changing over time: price &amp; value. If the price exceeds the value, hold; if not, sell. Dividend yearly. ]]></description><link>trading/trading-rule.html</link><guid isPermaLink="false">trading/trading rule.md</guid><pubDate>Sat, 28 Mar 2026 19:12:27 GMT</pubDate></item><item><title><![CDATA[trade tools]]></title><description><![CDATA[
is it free and opensource?: yes, MIT. PyPI available. is it free and opensource?: yes, MIT. PyPI available
]]></description><link>trading/trade-tools.html</link><guid isPermaLink="false">trading/trade tools.md</guid><pubDate>Fri, 27 Mar 2026 20:12:56 GMT</pubDate></item><item><title><![CDATA[Support Vector Machine]]></title><description><![CDATA[Linear Classifier: distance of x to the boundaryPS 9-1Note: scale of (which is ) does not affect this Euclidean distanceDefine "margin" - distance from boundary to the closest point in training set (assume linearly separable)Hard-Margin SVMSoft-Margin SVMIdea: maximize the margin - separation between boundary &amp; points.
margin determines complexity in perceptron
training points are random samples - leave margin to be safe from uncertainty ... is uncertain estimate
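The distance-to-boundary claim above (the scale of w does not affect the Euclidean distance) can be checked with a small sketch of mine:

```python
import math

def distance_to_boundary(w, b, x):
    """Distance from point x to the hyperplane w·x + b = 0."""
    dot = sum(wi * xi for wi, xi in zip(w, x))
    return abs(dot + b) / math.sqrt(sum(wi * wi for wi in w))

w, b, x = [3.0, 4.0], 1.0, [1.0, 1.0]
print(distance_to_boundary(w, b, x))             # → 1.6
print(distance_to_boundary([6.0, 8.0], 2.0, x))  # same point, scaled (w, b) → 1.6
```

Doubling (w, b) doubles both the numerator |w·x + b| and the norm ‖w‖, so the distance is unchanged, which is why the margin formulation is free to fix the numerator.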
Formulation - fix the numerator. 2 possible cases: on boundary: is in feasible region: at a min (or in the graph above)
Combine the cases: find the stationary point: KKT (Karush-Kuhn-Tucker) conditions turn one optimization problem into another. Lagrangian: combine the goal and the constraints
constraints: that point is a support vector that point is not a support vector and doesn't matter. Goal: minimize the weights. suppose optimal is known, then minimize Note: So at min : (the minimum)Define maximizing could yield weak duality theorem: Strong duality theorem: if is convex
feasible region is convex (feasible region is not degenerate)
then . Solving the dual problem is equivalent to solving the original primal problem. Let be the Lagrange multiplier for the -th constraint, i.e. Lagrangian: Dual function: SVM Dual Problem: Given , ]]></description><link>masters/semb/cs5487-ml/support-vector-machine.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Support Vector Machine.md</guid><pubDate>Thu, 26 Mar 2026 09:46:16 GMT</pubDate></item><item><title><![CDATA[Problem Set 9]]></title><link>masters/semb/cs5487-ml/problem-set-9.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Problem Set 9.md</guid><pubDate>Thu, 26 Mar 2026 08:14:32 GMT</pubDate></item><item><title><![CDATA[Convergence of Random Variable]]></title><description><![CDATA[
Convergence in distribution: ,
Convergence in probability: ,
Almost sure convergence: ,
written as: the distribution converges: written as: is close to with greater and greater probability. Given a large , the probability of being an outlier will be very small, but there is still a chance. Written as: The probability that the random sequence converges is one. It guarantees that any sample sequence definitely converges. "The probability of the convergence event is one" means the probability that the random-variable sequence converges to some fixed real number is 1.]]></description><link>math/statistics/convergence-of-random-variable.html</link><guid isPermaLink="false">math/Statistics/Convergence of Random Variable.md</guid><pubDate>Thu, 26 Mar 2026 07:52:10 GMT</pubDate></item><item><title><![CDATA[Problem Set 8]]></title><link>masters/semb/cs5487-ml/problem-set-8.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Problem Set 8.md</guid><pubDate>Thu, 26 Mar 2026 07:33:42 GMT</pubDate></item><item><title><![CDATA[Linear Regression]]></title><link>masters/semb/cs5487-ml/linear-regression.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Linear Regression.md</guid><pubDate>Thu, 26 Mar 2026 07:30:19 GMT</pubDate></item><item><title><![CDATA[ML]]></title><description><![CDATA[LOTUS (Law of the Unconscious Statistician) Theorem: It's all about sign cancelling. The probability density function is defined as: given , to transform to a probability about , slice . If is monotonically decreasing in , if is monotonically increasing in : take the derivative of , apply the chain rule: and both cases generate the same expression: So for piecewise monotonic : this leads to LOTUS: the potential negative sign of also cancels with the potential flipping of integral limits caused by substituting variables in the definite integral: Proof of the definition of , applying LOTUS: Thus: Proof of the definition of : applying LOTUS, the direct consequence of the definition of is: so: thus: Let's say , and Proof of : Definition of expectation on a vector/matrix of random variables: in the case of a vector or matrix, addition is element-wise, so it's easy to know: When it comes to vector multiplication, assume is a vector, and is a
random variable, then the -th element of vector is: so . When it comes to matrix multiplication, applying linearity of expectation in scalar space: thus: The definition of is: This matches the outer product rule: applying the rule : applying the rule with the results above, we have: multivariate version of LOTUS (continuous case): independence implies the joint probability density is the product of the marginal probability densities: By applying both LOTUS: so for , it will be zero if and are independent: Relationship uncorrelated: uncorrelated independence, independence uncorrelated and are uncorrelated: marginal distribution: it is apparent that , but , at , so and are not independent. Definition of the conditional expectation: by the law of total expectation: is proven above, so and are uncorrelated. Also called Iterated Expectation: taking the expectation on the conditioned variables and then on the conditions generates the expectation on the joint distribution. For any : when and are independent, we have , so: Quadratic Form Multivariate Gaussian: a probability density over real vectors (joint probability) With a diagonal covariance matrix: The Mahalanobis distance would be: The determinant of would be: So the density becomes a product of univariate Gaussians, indicating that a diagonal covariance matrix makes the variables independent: with , the Gaussian is squeezed in the y dimension.
<img alt="Pasted image 20260304060504.png" src="masters/semb/cs5487-ml/attachments/pasted-image-20260304060504.png" target="_self">with , all components are i.i.d.<br>
<img alt="Pasted image 20260304060513.png" src="masters/semb/cs5487-ml/attachments/pasted-image-20260304060513.png" target="_self">eigen-decomposelet Mahalanobis distance could be:: centering : rotate and scale , so they are independent and normalized.Eigenvalue and Eigenvector: ,
Since is a solution for , to get additional values, its determinant should be zero. Apply this idea to : Eigenvectors define the directions; a small eigenvalue concentrates the distribution, while a large eigenvalue spreads the distribution (more covariance).<br><img alt="Pasted image 20260304064307.png" src="masters/semb/cs5487-ml/attachments/pasted-image-20260304064307.png" target="_self">transform: A Gaussian distribution can be identified: Now process the rest of the expression. The remaining part of the exponent of is times: This could also be treated as a fixed-value Gaussian: And the remaining factor is: which matches the form of a Gaussian together with the exponential part. So the overall distribution expression is: The resulting scaled Gaussian means this could not hold unless the scaling factor is 1. Expand the exponent term, and then complete the square: so the Gaussian part of the product is: leaving the coefficient as: to figure out , apply the Woodbury matrix identity to simplify the nested inverse. Simply applying the Woodbury identity introduces two terms, so we make a further transformation: so the last term of the exponent should be: Combining the two terms in would give: also: observing the second term, we apply the Exchange Identity here: and since is symmetric, so , and the coefficient is:<br>This transformation procedure uses many rules of <a data-href="Determinant" href="math/linear-algebra/determinant.html" class="internal-link" target="_self" rel="noopener nofollow">Determinant</a>The coefficient and the redundant exponential term also form a fixed-value Gaussian: So the product of multivariate Gaussians is: Inverting after a low-rank modification with low cost. Motivation: if you've inverted a large matrix , inverting it after adding a low-rank update should not have to start from scratch
: the core update, : brings the update into the space of , How to construct the inverse
subtract the correction: adding something to the original matrix usually "shrinks" the inverse. To cancel the last two terms, we can build the correction with a sandwich structure to match the terms of :
there is a in the back
we don't want to invert , even if it's invertible, so there should be a in the form
let's call the remaining part . So the correction term should have the form: so we have the last term as: The only unknown matrix here is ; compared with , the only difference is that the last term has a inside, making this , so we are able to cancel these two terms:
So the correction term should be , and the inverse should be like: Notice that we assumed is invertible, which does not always hold. But surprisingly this form also holds when is not invertible. It can be verified by the same process above.Warning
Do not mix up the concepts of Correlation and Covariance
correlation between distributions is defined as: Using problem 1.8, the correlation between two Gaussian distributions is: trace is defined as the sum of the diagonal. Eigenvalues are roots of the characteristic function:<br>the latter equivalence is guaranteed by <a data-href="Polynomials#Identity Theorem for Polynomials" href="math/polynomials.html#Identity Theorem for Polynomials" class="internal-link" target="_self" rel="noopener nofollow">Polynomials &gt; Identity Theorem for Polynomials</a>let , then:

CS5487 Problem Set 1
Probability Theory and Linear Algebra Review
Antoni Chan
Department of Computer Science
City University of Hong Kong

Probability Theory

Problem 1.1 Linear Transformation of a Random Variable
Let x be a random variable on R, and a, b ∈ R. Let y = ax + b be the linear transformation of x. Show the following properties:
E[y] = aE[x] + b, (1.1)
var(y) = a²var(x). (1.2)
Now, let x be a vector r.v. on R^d, and A ∈ R^{m×d}, b ∈ R^m. Let y = Ax + b be the linear transformation of x. Show the following properties:
E[y] = AE[x] + b, (1.3)
cov(y) = A cov(x) A^T. (1.4)

Problem 1.2 Properties of Independence
Let x and y be statistically independent random variables (x ⊥ y). Show the following properties:
E[xy] = E[x]E[y], (1.5)
cov(x, y) = 0. (1.6)

Problem 1.3 Uncorrelated vs Independence
Two random variables x and y are said to be uncorrelated if their covariance is 0, i.e., cov(x, y) = 0. For statistically independent random variables, their covariance is always 0 (see Problem 1.2), and hence independent random variables are always uncorrelated. However, the converse is generally not true; uncorrelated random variables are not necessarily independent.
Consider the following example. Let the pair of random variables (x, y) take values (1, 0), (0, 1), (−1, 0), and (0, −1), each with equal probability (1/4).
(a) Show that cov(x, y) = 0, and hence x and y are uncorrelated.
(b) Calculate the marginal distributions, p(x) and p(y). Show that p(x, y) ≠ p(x)p(y) and thus x and y are not independent.
(c) Now consider a more general example. Assume that x and y satisfy
E[x|y] = E[x], (1.7)
i.e., the mean of x is the same regardless of whether y is known or not (the above example satisfies this property). Show that x and y are uncorrelated.

Problem 1.4 Sum of Random Variables
Let x and y be random variables (possibly dependent), show the following property:
E[x + y] = E[x] + E[y]. (1.8)
Furthermore, if x and y are statistically independent (x ⊥ y), show that
var(x + y) = var(x) + var(y). (1.9)
However, in general this is not the case when x and y are dependent.

Problem 1.5 Expectation of an Indicator Variable
Let x be an indicator variable on {0, 1}. Show that
E[x] = p(x = 1), (1.10)
var(x) = p(x = 0)p(x = 1). (1.11)

Problem 1.6 Multivariate Gaussian
The multivariate Gaussian is a probability density over real vectors x ∈ R^d, which is parameterized by a mean vector µ ∈ R^d and a covariance matrix Σ ∈ S^d_+ (i.e., a d-dimensional positive-definite symmetric matrix). The density function is
p(x) = N(x|µ, Σ) = (2π)^{−d/2} |Σ|^{−1/2} e^{−(1/2)‖x−µ‖²_Σ}, (1.12)
where |Σ| is the determinant of Σ, and
‖x − µ‖²_Σ = (x − µ)^T Σ^{−1} (x − µ) (1.13)
is the Mahalanobis distance. In this problem, we will look at how different covariance matrices affect the shape of the density.
First, consider the case where Σ is a diagonal matrix, i.e., the off-diagonal entries are 0: Σ = diag(σ₁², …, σ_d²). (1.14)

PS-1.pdf

parameter estimation
Idea of MLE: "likelihood" is the probability that the dataset appears, so the parameter that maximizes the likelihood might be the true parameter. Step 1, make assumptions:
assume the distribution is ..., and the PDF/PMF is . Now estimate the parameters by maximizing the likelihood. Likelihood: the product of the probability / probability density of the data / dataset. Given a sample dataset , the likelihood is a function of : Since log is monotone, sometimes we use the log-likelihood: Find the that maximizes the probability that occurs: the optimized converges in probability to the true parameter , and the distribution of the estimation error tends to a normal distribution. The MLE of is . Define the likelihood of as the supremum of with : And the optimized should be the one that makes be the supremum of , which is the supremum of : and the likelihood of is: which means , so can be . Maximum Likelihood Estimation. Prior: the initial distribution assumed for the parameter. Given a dataset, adjust the distribution of the parameter, and find the parameter value that maximizes its posterior probability. It can be written as: Maximum A Posteriori. Non-Parametric Estimation: center a kernel at each data point, and take the average. The estimated distribution of the data can be used to:
bootstrap an algorithm
help make decisions
Kernel Density Estimation
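The kernel idea above (center a kernel at each data point and average) can be sketched in a few lines; the Gaussian kernel, the bandwidth value, and the sample data here are my own illustrative choices:

```python
import numpy as np

# Kernel density estimate: center a Gaussian kernel at each data point
# and average; the data and bandwidth h are made-up illustrative values.
def kde(samples, x, h=0.5):
    u = (x - samples[:, None]) / h
    k = np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)
    return k.mean(axis=0) / h

data = np.array([0.0, 0.5, 1.0, 5.0])
grid = np.linspace(-2.0, 8.0, 101)
density = kde(data, grid)
```

The bandwidth h trades off smoothness against fidelity to the samples.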
E-step: estimating missing data with current parameters
M-step: maximizing likelihood to update parameters
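The two steps can be sketched for a toy two-component Gaussian mixture with known unit variances; the data points and starting means below are made up for illustration:

```python
import numpy as np

# Toy EM for a mixture of two 1-D Gaussians with known unit variances;
# the data and the starting means are made-up illustrative values.
x = np.array([-2.1, -1.9, -2.0, 1.8, 2.2, 2.0])
mu = np.array([-1.0, 1.0])        # component means (to be learned)
w = np.array([0.5, 0.5])          # mixing weights
for _ in range(50):
    # E-step: responsibilities = posterior of the missing component labels
    lik = np.exp(-0.5 * (x[:, None] - mu[None, :]) ** 2)
    r = w * lik
    r /= r.sum(axis=1, keepdims=True)
    # M-step: re-estimate parameters by maximizing the expected log-likelihood
    mu = (r * x[:, None]).sum(axis=0) / r.sum(axis=0)
    w = r.mean(axis=0)
```

Each iteration cannot decrease the data likelihood; here the means settle near the two data clusters.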
Expectation-Maximization Algorithm
If there is a constraint and we need the extreme points satisfying it, construct a continuous function with the constraint added as a term. Intuitively, the constraint determines the optimal point set (penalty 0) and the penalty strength when the point falls elsewhere; the Lagrange multiplier controls the strength of the restriction. Find stationary points by setting the gradient to 0; the stationary points solved this way satisfy the constraint, and at them the gradient of the objective is collinear with the gradient of the constraint.
Lagrange Multiplier
optimization techniques<br><a data-href="Poisson Process#Poisson Distribution" href="math/statistics/poisson-process.html#Poisson Distribution" class="internal-link" target="_self" rel="noopener nofollow">Poisson Process &gt; Poisson Distribution</a>
Poisson Distribution: the distribution of a random variable that measures how many events happen in a given period of time, with a parameter called the arrival rate. Likelihood function of given samples . MLE of : find the stationary point; the second derivative of this function is , so the stationary point is , which is the sample mean. Clark's data: , the sample mean is:
from scipy.stats import poisson

n = 576
mu = 535 / n
v_f = []
v_p = []
for i in range(0, 6):
    p = poisson.pmf(i, mu)
    v_p.append(p)
    f = p * n
    v_f.append(f)
print("Probabilities: ")
for i, p in enumerate(v_p):
    print(f"P(X={i}) = {p:.3}")
print("Expected Counts: ")
for i, f in enumerate(v_f):
    print(f"ec(X={i}) = {f:.3}")
the outputs are:
Probabilities:
P(X=0) = 0.395
P(X=1) = 0.367
P(X=2) = 0.17
P(X=3) = 0.0528
P(X=4) = 0.0122
P(X=5) = 0.00228 Expected Counts:
ec(X=0) = 2.28e+02
ec(X=1) = 2.11e+02
ec(X=2) = 98.1
ec(X=3) = 30.4
ec(X=4) = 7.06
ec(X=5) = 1.31
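As a cross-check of the table above, the same values can be recomputed without scipy, using the closed-form Poisson pmf (a sketch):

```python
import math

# Recompute the Poisson probabilities and expected counts from the text
# using the closed-form pmf e^(-mu) * mu^k / k!, with mu = 535/576.
n = 576
mu = 535 / n

def pmf(k: int, lam: float) -> float:
    return math.exp(-lam) * lam ** k / math.factorial(k)

probs = [pmf(k, mu) for k in range(6)]
expected = [n * p for p in probs]
```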
which is remarkably close to the actual data, so my conclusion is that the assumption is very likely true.Problem Set 2]]></description><link>masters/semb/cs5487-ml/ml.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/ML.canvas</guid><pubDate>Thu, 26 Mar 2026 07:17:41 GMT</pubDate><enclosure url="." length="0" type="false"/><content:encoded>&lt;figure&gt;&lt;img src="."&gt;&lt;/figure&gt;</content:encoded></item><item><title><![CDATA[Derivatives]]></title><description><![CDATA[Proof: By applying <a data-href="Limits#Composition Rule" href="math/mathematical-analysis/limits.html#Composition Rule" class="internal-link" target="_self" rel="noopener nofollow">Limits &gt; Composition Rule</a>: if is determined by , (), then the differential of y () is .
if is a free variable, then is an independent variable, representing a change of .<br>the little o here represents an infinitesimal of strictly higher order than . <a data-tooltip-position="top" aria-label="Asymptotic Notations > Little $o$" data-href="Asymptotic Notations#Little $o$" href="math/mathematical-analysis/asymptotic-notations.html#Little $o$" class="internal-link" target="_self" rel="noopener nofollow">little o</a> if is differentiable then it's derivable; if is derivable then it's differentiable: thus, derivable is equivalent to differentiable. Totally differentiable, if: if all partial derivatives are continuous, then it is totally differentiable. Rate of change along this line/direction : the gradient is the vector of partial derivatives. Hessian Matrix
Each element of the Hessian Matrix is a second derivative, w.r.t. the -th and then the -th input.
Jacobian matrix
Jacobian matrix is the partial derivative matrix for a multi-input-output function.
The element is the derivative of the -th output w.r.t. the -th input
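A numerical sketch of this definition (the helper numerical_jacobian and the test function below are my own, using central differences):

```python
import numpy as np

def numerical_jacobian(f, x, eps=1e-6):
    # J[i, j] = d f_i / d x_j, estimated by central differences
    x = np.asarray(x, dtype=float)
    y = np.asarray(f(x))
    J = np.zeros((y.size, x.size))
    for j in range(x.size):
        d = np.zeros_like(x)
        d[j] = eps
        J[:, j] = (np.asarray(f(x + d)) - np.asarray(f(x - d))) / (2 * eps)
    return J

# f(x0, x1) = [x0 * x1, x0 + x1]; analytic Jacobian [[x1, x0], [1, 1]]
J = numerical_jacobian(lambda v: np.array([v[0] * v[1], v[0] + v[1]]), [2.0, 3.0])
```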
]]></description><link>math/mathematical-analysis/derivatives.html</link><guid isPermaLink="false">math/Mathematical Analysis/Derivatives.md</guid><pubDate>Thu, 26 Mar 2026 01:04:20 GMT</pubDate></item><item><title><![CDATA[Geometry]]></title><description><![CDATA[the tangent hyperplane on the plane is defined as:]]></description><link>math/mathematical-analysis/geometry.html</link><guid isPermaLink="false">math/Mathematical Analysis/Geometry.md</guid><pubDate>Thu, 26 Mar 2026 00:48:04 GMT</pubDate></item><item><title><![CDATA[Roadmap]]></title><description><![CDATA[Dedekind Definition, Supremum Principle, Monotone Convergence Theorem, Nested Interval Theorem, Cauchy Convergence Criteria, Limits (Definition of Convergence), Riemann Sum (Definition of Integral), Finite Covering Theorem, Bolzano-Weierstrass Theorem, Derivative (Rate of Change), Continuity, Differentiable, Intermediate Value Theorem, Rolle's MVT, MVT for Integral, Lagrange MVT, Cauchy MVT, Extreme Value Theorem, Bounded Theorem, Fermat's Lemma, Theorems for Continuity, Fundamental Theorem of Calculus, derivable, Squeeze Theorem, Sign Preserving Property, Heine's Theorem, Local Boundedness, Real Number Theorems, Order Property of Limits, Partial Derivative, Total Differential, Gradient, Partial Differential, Hessian Matrix, Jacobian Matrix, Total Derivative, Directional Derivative]]></description><link>math/mathematical-analysis/roadmap.html</link><guid isPermaLink="false">math/Mathematical Analysis/Roadmap.md</guid><pubDate>Wed, 25 Mar 2026 23:42:41 GMT</pubDate></item><item><title><![CDATA[Limits]]></title><description><![CDATA[If by selecting a small punctured neighborhood of , a variable determined by can be made arbitrarily close to a fixed value, then that variable converges to that fixed value. is saying: the limit of as approaches ; it means that the value of the function can be made arbitrarily close to by choosing x close to . is saying: In plain words: if all paths converge to one value, then convergence holds everywhere in that region. Proof:
If the function converges, naturally every path converges
If the function does not converge, then a bad non-convergent sequence can be constructed
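A classic illustration of the path argument (my own example): f(x, y) = x·y/(x² + y²) takes different limits along different paths into the origin, so the two-variable limit does not exist there:

```python
# f(x, y) = x*y / (x^2 + y^2) near (0, 0):
# along the path y = 0 the values tend to 0, along y = x they tend to 1/2,
# so the two-variable limit at the origin does not exist.
def f(x, y):
    return x * y / (x * x + y * y)

along_axis = [f(1 / n, 0.0) for n in (10, 100, 1000)]
along_diag = [f(1 / n, 1 / n) for n in (10, 100, 1000)]
```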
Sign-Preserving Property
If a function has a limit at , then there's a punctured neighborhood of on which it keeps the same sign as the limit . Suppose . Set ; then , . So as when . Squeeze Theorem
if , and and have the same limit at , then also has that limit at . Order Property of Limits
If , and , then . If , then for , : which contradicts . So . Local Boundness
If , is bounded in some punctured neighborhood of swappable to scalar productProofwith , , thus:distributive to function additionProof: (triangular Inequality)with , with , Thusdistributive to function productProof:
involve a trick of constructing and be aware that is locally bounded due to the <a data-href="#Local Boundness" href="#Local Boundness" class="internal-link" target="_self" rel="noopener nofollow">Local Boundness</a>
:
with , , thus with ,
so, with , . distributive to function division. Proof key steps: step 1: prove <br>step 2: use <a data-href="#distributive to function product" href="#distributive to function product" class="internal-link" target="_self" rel="noopener nofollow">distributive to function product</a>: Detailed Proof. This proof is divided into two steps:
proof and reverse triangular inequality
product rule
with , , which says: with , . Thus: applying the product rule: composition rule
if is continuous, then ]]></description><link>math/mathematical-analysis/limits.html</link><guid isPermaLink="false">math/Mathematical Analysis/Limits.md</guid><pubDate>Wed, 25 Mar 2026 23:25:49 GMT</pubDate></item><item><title><![CDATA[Real Number]]></title><description><![CDATA[
Integer
Rational Number: the quotient of two integers, with nonzero denominator
Real Number: the gaps between the rationals; using Dedekind Cuts, a unique real number can be defined at each gap of the rationals (Dedekind Completeness Theorem).
The meaning of completeness of the reals: the real number set has no gaps. Six equivalent theorems of real-number completeness:
Supremum principle: every nonempty subset of the reals with an upper bound has a supremum (least upper bound)
Monotone convergence theorem: every bounded monotone sequence converges
Nested interval theorem: a nested sequence of closed intervals converges to a unique real number
Finite covering theorem: every open cover of a closed interval has a finite subcover
Accumulation point theorem (Bolzano-Weierstrass Theorem)
Cauchy convergence criterion: a sequence converges if and only if it is a Cauchy Sequence. Dedekind definition of the reals: the real number set defined by Dedekind Cuts: the set of reals is defined as the set of all Dedekind cuts, a partition of into two sets and such that:
Each element of A is less than every element of B
A contains no largest element
Another interpretation of this definition: every real number is defined as a subset of the rational number set . Proposition: an ordered partition ( , ) of the reals determines a unique real number. In the reals, a nonempty set with an upper bound must have a least upper bound (Supremum). Take the union of the left sets of the Dedekind cuts of all real numbers in the set: is a Dedekind cut (i.e., a real number)
is a set of rationals: all of them are sets of rationals
nonempty and not all of : has an upper bound, therefore , therefore ,
downward closed / ordered: for any , if , then ; also , , so , or
no largest element:
is an upper bound of : is the least upper bound of :
for any upper bound of ,
A real number is a set
In the Dedekind Cut definition, a real number is defined by the cut, so the left set of that cut, the set of rationals smaller than it, defines the real number.
Definition of the order on the reals. Supremum principle: proof idea
A real number under the Dedekind Cut definition is the left set of an ordered partition of the rationals, and any set of reals with an upper bound can be converted into the union of the corresponding left sets. This union is also a set of rationals and also corresponds to a real number, and it is the least superset; therefore, when an upper bound exists, the supremum exists and always equals the union of the equivalent left sets.
Dedekind Completeness Theorem: proof idea
Prove the supremum principle: every set of reals with an upper bound has a supremum, hence every left set of a partition of the reals has a supremum, hence an ordered partition uniquely determines a real number.
[!note] Why the supremum principle is the completeness of the reals
The supremum principle says every bounded set of reals has a supremum, that is, any partition of the reals determines a unique real number through the boundedness of its left set. The reals are thus unlike the rationals, where a partition may fail to determine a unique rational, i.e., there are "holes".
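As a toy model of this definition, a cut can be represented by the membership predicate of its left set; the sqrt(2) example below is my own illustration:

```python
from fractions import Fraction

# Model a Dedekind cut by the membership predicate of its left set:
# the cut for sqrt(2) contains exactly the rationals q with q < 0 or q^2 < 2.
def in_left_set_sqrt2(q: Fraction) -> bool:
    return q < 0 or q * q < 2
```

The left set is nonempty, downward closed, and has no largest element, matching the properties listed above.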
Density of the rationals: between any two real numbers a rational number can be found.
For , , so there is always
A monotone bounded sequence converges: , exists. By the supremum principle, let the supremum of be . Hence a monotone bounded sequence converges, and it converges to its supremum. Closed intervals: Compactness. There is exactly one point that belongs to all the intervals. Given a sequence of closed intervals , , the sequence of left endpoints is monotonically increasing and bounded, so its limit exists and
Similarly for the right endpoints: exists, and ; by the limit laws and the interval lengths tending to zero: is simultaneously less than or equal to every and greater than or equal to every , therefore
Uniqueness: if two such points were both inside every interval, the endpoint sequences would have two different limits, contradicting the interval lengths tending to zero; so the point is unique. A nest of open intervals does not determine a unique point: for any in , there is always a large enough such that . Hence closed intervals: Compactness
Open Cover: an open cover of a set, , where each is an open interval
Finite Subcover
If a closed interval has an open cover, it has a finite subcover:
Suppose it cannot be finitely covered; bisect it, choose a half that cannot be finitely covered as , and continue this operation to build a nest of closed intervals. By the nested interval theorem this determines a unique real number ; since the cover covers , , and since the cover is open, , so an interval defined to be not finitely coverable is finitely covered, a contradiction. Accumulation point theorem: a bounded sequence must have a convergent subsequence; a bounded infinite sequence must have an accumulation point (convergent subsequence). Proof
Let , and let be the half of the bisected that contains infinitely many terms; construct a nest of closed intervals this way, which determines a unique point. Pick an from each ; then by the squeeze theorem: . Cauchy convergence criterion: a real sequence converges if and only if it is a Cauchy sequence. Proof: necessity is clear, since a convergent sequence must be Cauchy. Sufficiency: a Cauchy sequence is bounded, so it has a subsequence converging to some (<a data-href="#Bolzano-Weierstrass Theorem" href="math/mathematical-analysis/real-number.html#Bolzano-Weierstrass_Theorem_0" class="internal-link" target="_self" rel="noopener nofollow">Bolzano-Weierstrass Theorem</a>); pick the subsequence so that the indices increase with , and ; then for any , find an such that the difference between terms of the Cauchy sequence and the distance from the subsequence to a are both less than ; choose , then
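The nested-interval construction can be sketched by bisection (my own example, shrinking closed intervals onto the root of x^2 - 2):

```python
# Nested closed intervals by bisection: each step keeps the half of [a, b]
# where f changes sign, so the closed intervals shrink onto a single point
# (here the root of f(x) = x^2 - 2 on [1, 2]).
def bisect(f, a, b, steps=60):
    for _ in range(steps):
        m = (a + b) / 2
        if f(a) * f(m) <= 0:
            b = m
        else:
            a = m
    return (a + b) / 2

root = bisect(lambda x: x * x - 2, 1.0, 2.0)
```

After n steps the interval has length (b - a) / 2^n, so the common point is pinned down to arbitrary precision.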
]]></description><link>math/mathematical-analysis/real-number.html</link><guid isPermaLink="false">math/Mathematical Analysis/Real Number.md</guid><pubDate>Wed, 25 Mar 2026 23:03:19 GMT</pubDate></item><item><title><![CDATA[Mathematical Analysis]]></title><description><![CDATA[
Real Number: Dedekind definition: a real number is defined by a cut (partition) of the rational number set
Supremum principle: a bounded set of real number has a supremum
Monotone convergence theorem: a bounded monotone sequence has a limit
Nested interval theorem: a set of nested interval defines a unique number
Bolzano-Weierstrass theorem: a bounded infinite sequence has a convergent subsequence
Cauchy Convergence Criterion: a variable converges to some fixed value in a punctured neighborhood if and only if the difference of any two of its values converges to 0; you don't need to know the fixed value. Limits: the stability/convergence on a fixed value in the punctured neighborhood. Derivative: Rate of Change. Differential: linear representation of change.
Indefinite Integrals: the antiderivative set
definite integrals: Riemann Integrals - limit of the sum of "areas"
Continuity: the prerequisite of derivable properties on a closed interval: EVT, IVT, MVT
FTC: the integral of a function on a closed interval is the difference of its antiderivative at the endpoints. Limit of a sequence: for any neighborhood of A, when n is taken large enough the terms of the sequence always lie in that neighborhood of A; then the sequence is said to converge to its limit A
Limit of a function: for any neighborhood of A, when the preimage is close enough to a, the value of the function always lies in that neighborhood of A; then the function is said to converge to its limit A. Limit of a sequence: for any , if there always exists , let , then
Definition of the limit of a function. The limit definition in plain words:
The limit is the value the function approaches arbitrarily closely near a point; in short, "for any arbitrarily small tolerance on the value, a restriction on the domain exists". is defined on some punctured neighborhood of ; if there exists a constant such that for any there is a such that when , we have , then . Limits at infinity: much the same; be careful to distinguish one-sided positive, one-sided negative, and two-sided limits. In plain words: if all paths converge to one value, then convergence holds everywhere in that region. If the function converges, naturally every path converges
If the function does not converge, then a bad non-convergent sequence can be constructed
Order preservation (squeeze theorem): if the two bounding functions both have limits and they are the same, then the limit of the function squeezed between them is the same too, because in every case one of the two bounds is at least as far from the limit.
Sign preservation, uniqueness, local boundedness: all can be proved by taking a small enough . Sign preservation, from function to limit: if the function is nonnegative but its limit were negative, then by sign preservation the function would have to be negative, a contradiction; so a nonnegative function has a nonnegative limit. The rules for the limit operator with real addition, subtraction, multiplication, division, and composition can be proved from the definition: when both functions have limits near the point, taking limits distributes over addition, subtraction, and scalar product, so it is a linear operator; it also distributes over multiplication, and over division when the denominator's limit is not 0, provided of course the functions have limits at a. Applying the idea of finding from the limit definition yields the limit of a composite function: the inner function must differ from its limit on some punctured neighborhood. The use: for a complicated function, if a composite structure can be extracted, take limits from the inside out; if that fails, try other methods. Real powers
Integer powers are defined as n-fold multiplication; rational powers as roots of integer powers; irrational powers are defined by approximating with rational powers (e.g. the limit of the powers of the n-digit decimal approximations), and the limit can be shown to exist, to be unique, and to be independent of the choice of approximating sequence. 0/0 type: an equivalent infinitesimal can directly replace a factor when computing a limit, because
transforming with the half-angle formula gives ,
it is easy to see , so . type: being continuous, derivable, or differentiable at a point all mean that some expression has a limit there:
continuous: the limit exists and equals the function value
derivable: has a limit
differentiable: a constant A exists making hold
Basic elementary functions: power functions, exponential functions, logarithmic functions, trigonometric functions, inverse trigonometric functions
Elementary functions: basic elementary functions plus constants, combined by finitely many arithmetic operations and compositions. An elementary function is continuous on its interval of definition. Continuity of a function at a point
Continuity on an interval: continuous at every point of an open interval; continuity on a closed interval additionally requires right/left continuity at the left/right endpoint. IVT, the intermediate value theorem: for a function continuous on , build a nest of intervals containing and apply the nested interval theorem; the nest determines a unique real number, and , proved by Heine's theorem, the continuity of the function on the interval, and the accumulation point theorem. If the function were unbounded, bisect the interval and pick a point from an unbounded half each time to construct a sequence; the points are chosen so that the constructed sequence diverges. But by the accumulation point theorem and Heine's theorem, this is a contradiction. Hence the function is bounded. Extreme value theorem: apply the boundedness theorem, so the function is bounded; apply the supremum principle, so it has a supremum. Suppose the function cannot attain its supremum on the interval; then is also a function continuous on the same closed interval, so it also has a supremum, i.e.:
, i.e. , so is not the supremum, a contradiction. Hence the opposite of the assumption holds: the supremum is attained on the interval, i.e. the extreme value is the supremum. At a maximum point: the rate of change from the left is always nonnegative, so the left derivative is nonnegative; likewise the right derivative is nonpositive; and since the function has a limit (is continuous) at the maximum point, the derivative also has a limit there, so left derivative = right derivative = 0. Differentiable: for one-variable functions, differentiable is equivalent to derivable, so the derivative can always be written in differential form. Multivariate differentiability, differentials: hence for one-variable functions the derivative has the Leibniz form: . Inverse function rule: differentiate both sides w.r.t. x. Chain rule and Leibniz Form. For different antiderivatives of an indefinite integral defined on , construct their difference; by the Lagrange mean value theorem, taking any , , the derivatives are everywhere equal, so the function values are everywhere equal and the difference is a constant. Substitution, based on the chain rule; substitution of the first kind: subtraction; substitution of the second kind: addition. Product rule; integration by parts, based on the product rule. Definition of the definite integral: the definite integral is the limit of approximations. It is defined as a limit: partition the integration interval into n pieces, approximate the change of the function by a function value inside each small interval, then sum; and this limit should be the same for any partition and any choice of points (which makes constructing integrals easy). Partitioned into pieces, with the largest piece
Riemann sum: , standing for an arbitrary partition and arbitrary choice of points, so Riemann sums can be constructed flexibly and the conclusions applied. Darboux sums: the upper &amp; lower Darboux sums exist (by the monotone convergence principle), and a continuous function on a closed interval is uniformly continuous, so the upper &amp; lower Darboux sums converge uniformly to the same value; hence any function continuous on a closed interval is integrable there, Riemann integrable. Note
Hence for any continuous function, one can define
A function continuous on has a maximum and a minimum (otherwise a path could be constructed inside the interval making the function values fail to converge); by sign preservation of limits, , and by the intermediate value theorem there is a point such that (obtained from the nested interval theorem, Heine's theorem, and continuity). A continuous function always has an antiderivative: for any integral, first find an antiderivative, one of which is: Vector Field. Stokes' Formula. Gauss Formula. Jacobian Matrix:
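For instance, derivability as the existence of a limit of difference quotients can be checked numerically (my own example with f(x) = x**3 at x = 2, where the analytic derivative is 12):

```python
# Derivability at a point means the difference quotient has a limit:
# (f(x0 + h) - f(x0)) / h for f(x) = x**3 at x0 = 2 approaches 12 as h -> 0.
def f(x):
    return x ** 3

x0 = 2.0
quotients = [(f(x0 + h) - f(x0)) / h for h in (1e-1, 1e-3, 1e-6)]
```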
Cumulative Distribution Function: Probability Density Function: Probability Mass Function: according to <a data-href="Continuous Functions#Fundamental Theorem of Calculus" href="math/mathematical-analysis/continuous-functions.html#Fundamental Theorem of Calculus" class="internal-link" target="_self" rel="noopener nofollow">Continuous Functions &gt; Fundamental Theorem of Calculus</a>, we can briefly define the CDF as Thus:]]></description><link>math/statistics/distribution.html</link><guid isPermaLink="false">math/Statistics/Distribution.md</guid><pubDate>Wed, 25 Mar 2026 21:32:16 GMT</pubDate></item><item><title><![CDATA[Continuous Functions]]></title><description><![CDATA[if a function is continuous on , it's saying that for any , s.t.: If a function is continuous on a closed interval , then it's bounded on that interval. Proof with <a data-tooltip-position="top" aria-label="Real Number > Bolzano-Weierstrass Theorem" data-href="Real Number#Bolzano-Weierstrass Theorem" href="math/mathematical-analysis/real-number.html#Bolzano-Weierstrass Theorem" class="internal-link" target="_self" rel="noopener nofollow">Bolzano-Weierstrass Theorem</a>. Suppose that the function is unbounded on . If the function is unbounded, repeat this process, starting with :
bisect the interval with ; select the unbounded half, or select a point making that half unbounded, to build a sequence ; continue this process with . And finally we have an infinite bounded sequence and an unbounded sequence s.t.:
<br>converge to some number (<a data-tooltip-position="top" aria-label="Real Number > Bolzano-Weierstrass Theorem" data-href="Real Number#Bolzano-Weierstrass Theorem" href="math/mathematical-analysis/real-number.html#Bolzano-Weierstrass Theorem" class="internal-link" target="_self" rel="noopener nofollow">Bolzano Weierstrass Theorem</a>)
<br> is unbounded, which contradicts to (<a data-tooltip-position="top" aria-label="math/Mathematical Analysis/Limits > Heine's Theorem" data-href="math/Mathematical Analysis/Limits#Heine's Theorem" href="math/mathematical-analysis/limits.html#Heine's Theorem" class="internal-link" target="_self" rel="noopener nofollow">Heine Theorem</a>)
So the function is bounded on IVT
If a function is continuous on a closed interval , then for any intermediate value , there's some value such that . Proof. Build nested intervals with this process:
bisect the interval with if , stop.
select the half making continue this process
<br>So we get a nested interval , which locate exactly one real number , and (<a data-tooltip-position="top" aria-label="Real Number > Nested Interval Theorem" data-href="Real Number#Nested Interval Theorem" href="math/mathematical-analysis/real-number.html#Nested Interval Theorem" class="internal-link" target="_self" rel="noopener nofollow">Nested Interval Theorem</a>). As the function is continuous on , with <a data-tooltip-position="top" aria-label="math/Mathematical Analysis/Limits > Heine's Theorem" data-href="math/Mathematical Analysis/Limits#Heine's Theorem" href="math/mathematical-analysis/limits.html#Heine's Theorem" class="internal-link" target="_self" rel="noopener nofollow">Heine's Theorem</a>: So we have another set of nested intervals that locate a unique number, which can only be since . (Nested Interval Theorem)Extreme Value Theorem
If a function is continuous on , then it attains a maximum and a minimum on .
<br>As the <a data-href="#Boundedness Theorem" href="math/mathematical-analysis/continuous-functions.html#Boundedness_Theorem_0" class="internal-link" target="_self" rel="noopener nofollow">Boundedness Theorem</a> says, the function is bounded, so it has a supremum and infimum, w.r.t. the <a data-tooltip-position="top" aria-label="Real Number > Dedekind Completeness Theorem" data-href="Real Number#Dedekind Completeness Theorem" href="math/mathematical-analysis/real-number.html#Dedekind Completeness Theorem" class="internal-link" target="_self" rel="noopener nofollow">Dedekind Completeness Theorem</a>. Say the supremum is . Suppose . Then is also a continuous function, thus also bounded. Suppose is an upper bound of on : then is also an upper bound for on , which contradicts the assumption that is the supremum. So there must be some value s.t. . Fermat's Lemma
the derivative at a maximum / minimum must be zero
Proof: Suppose is a maximum. It says that: Rolle's MVT
if a function is continuous on and , then: <br>A function is continuous on ; then by the <a data-href="#Extreme Value Theorem" href="math/mathematical-analysis/continuous-functions.html#Extreme_Value_Theorem_0" class="internal-link" target="_self" rel="noopener nofollow">Extreme Value Theorem</a>, there must be a maximum , and then by <a data-href="#Fermat's Lemma" href="math/mathematical-analysis/continuous-functions.html#Fermat's_Lemma_0" class="internal-link" target="_self" rel="noopener nofollow">Fermat's Lemma</a>, Lagrange MVT
If a function is continuous on , then: , which says the derivative at some intermediate point accounts for the change between the endpoints.
Proof:<br>build a flat function containing to apply <a data-href="#Rolle's MVT" href="math/mathematical-analysis/continuous-functions.html#Rolle's_MVT_0" class="internal-link" target="_self" rel="noopener nofollow">Rolle's MVT</a>:So that So with Rolle's MVT, Cauchy MVT
if functions are both continuous on , then: construct a function whose derivative has the form that fits the theorem:<br>so by <a data-href="#Rolle's MVT" href="math/mathematical-analysis/continuous-functions.html#Rolle's_MVT_0" class="internal-link" target="_self" rel="noopener nofollow">Rolle's MVT</a>:MVT for Integrals
If function is continuous on , then the mean integral is an intermediate value of <br> is continuous on , so with <a data-href="#Boundedness Theorem" href="math/mathematical-analysis/continuous-functions.html#Boundedness_Theorem_0" class="internal-link" target="_self" rel="noopener nofollow">Boundedness Theorem</a>, is bounded:<br>then with <a data-tooltip-position="top" aria-label="Intermediate Value Theorem" data-href="#Intermediate Value Theorem" href="math/mathematical-analysis/continuous-functions.html#Intermediate_Value_Theorem_0" class="internal-link" target="_self" rel="noopener nofollow">IVT</a>:if is continuous at accumulation function is one of the antiderivativeIntegration is difference:Proof:<br>Prove it by <a data-href="#MVT for Integrals" href="math/mathematical-analysis/continuous-functions.html#MVT_for_Integrals_0" class="internal-link" target="_self" rel="noopener nofollow">MVT for Integrals</a>:So any original function of , denoted by , can be represented by adding some constant to :substituting with , we know the constant is substituting with , then the integration of from to is the difference of the value of the endpoints of any antiderivative : it must be continuous to be derivable, or equivalently, differentiable.if is differentiable, then is also an infinitesimal -- which implies that is continuous.]]></description><link>math/mathematical-analysis/continuous-functions.html</link><guid isPermaLink="false">math/Mathematical Analysis/Continuous Functions.md</guid><pubDate>Wed, 25 Mar 2026 21:30:26 GMT</pubDate></item><item><title><![CDATA[Asymptotic Notations]]></title><description><![CDATA[Big : The Ceiling means growth rate is upper-bounded by This saids: there exists a constant and a constant , as , , or, is bounded by Prove that as , , which saids , it's bounded by Big Theta Notation: Same Growth Rate implies has exactly the same growth rate as saids that , as long as little : strict ceiling means becomes insignificant compared to or rules 
for little o: multiplying by a constant doesn't change its order: ; multiplying by an infinitesimal makes it an even higher-order infinitesimal: is ; adding a higher-order infinitesimal doesn't change its order: ; if , then . little means strict, big means non-strict:
is "o", meaning upper bound; is "omega", meaning lower bound; is "theta", meaning the same order as
]]></description><link>math/mathematical-analysis/asymptotic-notations.html</link><guid isPermaLink="false">math/Mathematical Analysis/Asymptotic Notations.md</guid><pubDate>Wed, 25 Mar 2026 21:08:38 GMT</pubDate></item><item><title><![CDATA[Transformation]]></title><link>masters/semb/cs5182-cg/transformation.html</link><guid isPermaLink="false">masters/SemB/CS5182-CG/Transformation.md</guid><pubDate>Fri, 20 Mar 2026 00:12:20 GMT</pubDate></item><item><title><![CDATA[SVD]]></title><description><![CDATA[rotation:Scaling:rotation back:]]></description><link>math/linear-algebra/svd.html</link><guid isPermaLink="false">math/Linear Algebra/SVD.md</guid><pubDate>Thu, 19 Mar 2026 11:07:16 GMT</pubDate></item><item><title><![CDATA[Internship]]></title><description><![CDATA[Internship
Graduate Trainee, Deloitte Digital
consulting
457,000 people, 850 offices. Project Manager
Business Analyst. Functional Consultant growing path: Functional Analyst: Communication
Functional Consultant: Problem Solving
Senior Functional Consultant: Manage Functional Team
Manager: Manage Project Delivery. Manulife
Canada]]></description><link>masters/internship.html</link><guid isPermaLink="false">masters/Internship.md</guid><pubDate>Wed, 18 Mar 2026 07:39:17 GMT</pubDate></item><item><title><![CDATA[Variance]]></title><description><![CDATA[applying <a data-href="Expectation#Linearity" href="math/statistics/expectation.html#Linearity" class="internal-link" target="_self" rel="noopener nofollow">Expectation &gt; Linearity</a> and <a data-href="Expectation#Distributivity w.r.t addition" href="math/statistics/expectation.html#Distributivity w.r.t addition" class="internal-link" target="_self" rel="noopener nofollow">Expectation &gt; Distributivity w.r.t addition</a>:Proof:Suppose , is a constant:Proof 1:<br>Given (Proofs are in <a data-href="Expectation#Linearity" href="math/statistics/expectation.html#Linearity" class="internal-link" target="_self" rel="noopener nofollow">Expectation &gt; Linearity</a> and <a data-href="Distribution#Transformation" href="math/statistics/distribution.html#Transformation_0" class="internal-link" target="_self" rel="noopener nofollow">Distribution &gt; Transformation</a>):Thus:Proof2:<br>with <a data-href="Expectation#LOTUS" href="math/statistics/expectation.html#LOTUS" class="internal-link" target="_self" rel="noopener nofollow">Expectation &gt; LOTUS</a> the proof is simpler:variance between two r.v.also:]]></description><link>math/statistics/variance.html</link><guid isPermaLink="false">math/Statistics/Variance.md</guid><pubDate>Wed, 18 Mar 2026 05:48:06 GMT</pubDate></item><item><title><![CDATA[Tayler Series]]></title><description><![CDATA[Taylor Series: simulating utilizing its behavior on Fundamental Theorem of CalculusIntegrate by parts <a data-href="Integration skills#integration by parts" href="math/mathematical-analysis/integrals/integration-skills.html#integration by parts" class="internal-link" target="_self" rel="noopener nofollow">Integration skills &gt; integration by parts</a>Integrate by parts:Suspect that:Inductive 
Reasoning<br>Maclaurin Series is simply the <a data-href="#Tayler Series" href="math/mathematical-analysis/series/tayler-series.html#Tayler_Series_1" class="internal-link" target="_self" rel="noopener nofollow">Tayler Series</a> centered at :<br>By <a data-tooltip-position="top" aria-label="Continuous Functions > MVT for Integrals" data-href="Continuous Functions#MVT for Integrals" href="math/mathematical-analysis/continuous-functions.html#MVT for Integrals" class="internal-link" target="_self" rel="noopener nofollow">MVT for Integrals</a> we can derive the Lagrange Remainder from the Integral Remainder: knowledge: So we know well about around , and thus can compute around .
from .factorial import factorial

DERIVATIVES = [0, 1, 0, -1]
pi = 3.141592653589793

def sin(x: float) -&gt; float:
    if x &lt; 0:
        return -sin(-x)
    if x &gt; 2 * pi:
        return sin(x % (2 * pi))
    res = 0
    for n in range(100):
        res += DERIVATIVES[n % 4] * (x ** n) / factorial(n)
    return res
]]></description><link>math/mathematical-analysis/series/tayler-series.html</link><guid isPermaLink="false">math/Mathematical Analysis/Series/Tayler Series.md</guid><pubDate>Tue, 17 Mar 2026 18:42:16 GMT</pubDate></item><item><title><![CDATA[Fourier Series]]></title><link>math/mathematical-analysis/series/fourier-series.html</link><guid isPermaLink="false">math/Mathematical Analysis/Series/Fourier Series.md</guid><pubDate>Tue, 17 Mar 2026 18:12:58 GMT</pubDate></item><item><title><![CDATA[Series]]></title><description><![CDATA[Series is an infinite sum, or the limit of a Partial Sum: a sum of numbers, or a sum of functions. Geometric, Harmonic,
P-Series Alternating Power series: series of power functions: sum of powers of Tests:
Ratio test
Integral test
Compare test. To prove that the image of the function is the limit of the Taylor series, write the series as the sum of a partial sum and a Remainder: odd and even numbers: a product is odd only when all factors are odd; a sum is odd only if there's an odd number of odd terms. Proof. Given: So: the limit operator distributes w.r.t addition]]></description><link>math/mathematical-analysis/series/series.html</link><guid isPermaLink="false">math/Mathematical Analysis/Series/Series.md</guid><pubDate>Tue, 17 Mar 2026 18:09:51 GMT</pubDate></item><item><title><![CDATA[Exponential and Logarithmic Functions]]></title><description><![CDATA[Extending exponential and logarithmic functions to real-number definitions. Because this integral with a variable upper limit has the following properties, similar to rational exponentials and logarithms: then define with the change-of-base formula, and define as the inverse function of ; from the conclusions above it is easy to see that the exponential function also obeys the power laws: hence: The natural base: define the base such that , i.e. ; by the definition of the derivative and the existence of the important limit, therefore: is called the base of the natural logarithm]]></description><link>math/mathematical-analysis/elementary-functions/exponential-and-logarithmic-functions.html</link><guid isPermaLink="false">math/Mathematical Analysis/elementary functions/Exponential and Logarithmic Functions.md</guid><pubDate>Tue, 17 Mar 2026 18:08:46 GMT</pubDate></item><item><title><![CDATA[Expectation]]></title><description><![CDATA[For a discrete random variable ; for a continuous random variable . LOTUS (Law of the Unconscious Statistician) Theorem: It's all about sign cancelling. The probability density function is defined as: given , to transform to a probability about , slice . if is monotonically decreasing in ; if is monotonically increasing in : take the derivative of , apply the chain rule: and both cases generate the same expression: So for piecewise monotonic : this leads to LOTUS for continuous variables: the potential negative sign of also cancels with the potential flipping of the integration limits caused by substituting variables in the definite integral: For discrete variables it's pretty intuitive: Since: So: Given by <a data-href="#LOTUS" href="math/statistics/expectation.html#LOTUS_0" class="internal-link" target="_self" rel="noopener nofollow">LOTUS</a> and linearity of <a data-href="Riemann Integral"
href="math/mathematical-analysis/integrals/riemann-integral.html" class="internal-link" target="_self" rel="noopener nofollow">Riemann Integral</a>applying LOTUS:In conclusion:Distributivity w.r.t additionExpectation is also distributable w.r.t addition:Independency requiredDistributivity w.r.t. multiplication while independentGiven , Expectation is distributable w.r.t multiplication:]]></description><link>math/statistics/expectation.html</link><guid isPermaLink="false">math/Statistics/Expectation.md</guid><pubDate>Tue, 17 Mar 2026 15:56:53 GMT</pubDate></item><item><title><![CDATA[Infinitesimals]]></title><description><![CDATA[ and are equivalent infinitesimals, written as saids thatoscillating function and should not cross zero "near" , otherwise, the quotient is not defined.
By the <a data-tooltip-position="top" aria-label="Limits > Product Rule" data-href="Limits#Product Rule" href="math/mathematical-analysis/limits.html#Product Rule" class="internal-link" target="_self" rel="noopener nofollow">Product Rule</a>, the equivalence relation is transitive:Proof:]]></description><link>math/mathematical-analysis/infinitesimals.html</link><guid isPermaLink="false">math/Mathematical Analysis/Infinitesimals.md</guid><pubDate>Mon, 16 Mar 2026 23:55:11 GMT</pubDate></item><item><title><![CDATA[Fourier Analysis]]></title><description><![CDATA[How strongly is the function present at frequency ? Integrate the signal with : IFT: inverse Fourier Transform, summing frequencies. spectrum: shows how much of each frequency is present in a signal. Time domain: shows how the signal changes over time
Frequency domain: shows how much energy exists at each pitch. Sampling is multiplying by a Dirac comb: Flipping, Sliding, Measuring. Commutativity. By using the IFT and FT:]]></description><link>tech/signal/fourier-analysis.html</link><guid isPermaLink="false">tech/signal/Fourier Analysis.md</guid><pubDate>Sun, 15 Mar 2026 23:45:06 GMT</pubDate></item><item><title><![CDATA[3D Gaussian Splitting]]></title><description><![CDATA[generate new viewpoints from existing images. Point cloud
Mesh
NeRF
3DGSNeRF: ]]></description><link>masters/semb/cs5182-cg/3d-gaussian-splitting.html</link><guid isPermaLink="false">masters/SemB/CS5182-CG/3D Gaussian Splitting.md</guid><pubDate>Fri, 13 Mar 2026 13:21:43 GMT</pubDate></item><item><title><![CDATA[Aliasing and Anti-aliasing]]></title><description><![CDATA[混叠Remove the high frequencyspectrum domainaliasing: due to the spectrum overlapUniform samplingNyquist Theoremsampling theorem is for uniform samplingReconstructingincrease the sampling rate
use more samples
use higher-order reconstruction
The Nyquist-Shannon sampling theoremSpatial aliasingTemporal aliasingClip each polygon against each pixel to form polygon fragments
Determine visible fragmentssubpixel samplingstill basically a supersampling algorithm]]></description><link>masters/semb/cs5182-cg/aliasing-and-anti-aliasing.html</link><guid isPermaLink="false">masters/SemB/CS5182-CG/Aliasing and Anti-aliasing.md</guid><pubDate>Fri, 13 Mar 2026 12:34:35 GMT</pubDate></item><item><title><![CDATA[The Rending Pipeline]]></title><description><![CDATA[
Input object in local coordinates
world coordinate transformation
perspective transformation
back-face removal
clipping
Rasterization
Hidden surface removal and shading
Output
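The world-transform and perspective steps above can be sketched with 4x4 homogeneous matrices; the translation and focal length below are made-up values:

```python
import numpy as np

# Local point -> world transform -> perspective -> perspective divide,
# with 4x4 homogeneous matrices (translation and focal length are made up).
translate = np.eye(4)
translate[:3, 3] = [0.0, 0.0, -5.0]   # world transform: push the scene down -z
f = 2.0                               # focal length of the projection
persp = np.diag([f, f, 1.0, 0.0])
persp[3, 2] = -1.0                    # copy -z into w for the divide
p_local = np.array([1.0, 1.0, 0.0, 1.0])
p_clip = persp @ translate @ p_local
p_ndc = p_clip[:3] / p_clip[3]        # perspective divide
```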
]]></description><link>masters/semb/cs5182-cg/the-rending-pipeline.html</link><guid isPermaLink="false">masters/SemB/CS5182-CG/The Rending Pipeline.md</guid><pubDate>Fri, 13 Mar 2026 11:41:31 GMT</pubDate></item><item><title><![CDATA[The Nyquist-Shannoon Sampling Thoerem]]></title><description><![CDATA[The samples of the continuous signal is equivalent to multiply the continuous signal by a Dirac comb:]]></description><link>tech/signal/the-nyquist-shannoon-sampling-thoerem.html</link><guid isPermaLink="false">tech/signal/The Nyquist-Shannoon Sampling Thoerem.md</guid><pubDate>Fri, 13 Mar 2026 11:37:53 GMT</pubDate></item><item><title><![CDATA[Ray Tracing]]></title><description><![CDATA[Radiosity for a large surface]]></description><link>masters/semb/cs5182-cg/ray-tracing.html</link><guid isPermaLink="false">masters/SemB/CS5182-CG/Ray Tracing.md</guid><pubDate>Fri, 13 Mar 2026 11:16:06 GMT</pubDate></item><item><title><![CDATA[Problem Set 6]]></title><description><![CDATA[Bayesian Theorem: Bayes Decision Rule: pick the class with the highest posterior probability.Bayes Risk: incorporate a loss function lowest possible error rate (Bayes Error Rate)(a): This type of loss function is useful when the one type of error is more unacceptable than the other. 
For example, when a doctor makes a diagnosis, it is worse to tell a sick patient they are healthy than to tell a healthy patient they might be sick. (b): Prior: the distribution of the class. Likelihood: given the class, the likelihood that the feature appears. The probability of making an error is obtained by summing over all possible features that would be predicted wrongly. The probability of making one kind of error can be calculated by subtracting the correct rate from the total probability of the specific class. Which makes: To minimize the error, we should include a point in the region (predicting it as 1) only if: Transforming it: the log-likelihood ratio test compares the likelihood ratio against the inverse-class-ratio threshold. Neyman-Pearson Lemma: the likelihood ratio test is the most powerful test possible. Loss Matrix: the cost of deciding class when the true class is . Risk of deciding : Risk of deciding : use Bayes' Theorem to swap the posterior for the likelihood. We decide 1 when the risk of deciding 1 is less than the risk of deciding 0; a higher loss value raises the threshold for choosing class 1 over class 0.]]></description><link>masters/semb/cs5487-ml/problem-set-6.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Problem Set 6.md</guid><pubDate>Wed, 11 Mar 2026 21:24:38 GMT</pubDate></item><item><title><![CDATA[Natural Language Model]]></title><description><![CDATA[Text: tokenize the text to make the input more informative. Segment and map the chunks into token_id using the vocabulary.
special characters: 0 for padding, &lt;/s&gt; for sentence stop, [UNK] for unknown
Batching/Padding
the tokenizer transforms the text into batches of sentences, and words in sentences are represented by token ids, which are mapped from a pre-defined vocabulary. Map each token id (a scalar, or a one-hot vector) into a dense vector. Tip
Token IDs are sparse, since they are equivalent to one-hot vectors. Pre-trained embedding model: Word2Vec, GloVe
End-to-end: update the embeddings by backpropagation together with all parameters in the model.
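The token-id-to-dense-vector step described above can be sketched as a plain table lookup; the toy vocabulary, embedding size, and the sample sentence are assumptions for illustration:

```python
# Sketch: tokenization, padding, and embedding lookup.
# The vocabulary and 4-dimensional embedding table are toy assumptions.
import random

vocab = {"</s>": 1, "[UNK]": 2, "hello": 3, "world": 4}  # 0 reserved for padding
dim = 4
random.seed(0)
# One dense row per vocabulary entry (plus the padding row 0).
embedding = [[random.random() for _ in range(dim)] for _ in range(len(vocab) + 1)]

def encode(words, max_len=6):
    """Map words to token ids, append the stop token, pad to a fixed length."""
    ids = [vocab.get(w, vocab["[UNK]"]) for w in words] + [vocab["</s>"]]
    return ids + [0] * (max_len - len(ids))

def embed(ids):
    """Equivalent to multiplying one-hot vectors by the embedding matrix."""
    return [embedding[i] for i in ids]

ids = encode(["hello", "unseen", "world"])
print(ids)  # [3, 2, 4, 1, 0, 0]: unknown word -> [UNK], then stop + padding
vectors = embed(ids)
```

In an end-to-end model the `embedding` table would be a trainable parameter updated by backpropagation rather than a frozen random table.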
GloVe: Distributional Hypothesis: words that occur in the same context have similar semantics]]></description><link>masters/semb/cs6493-nlp/natural-language-model.html</link><guid isPermaLink="false">masters/SemB/CS6493 NLP/Natural Language Model.md</guid><pubDate>Wed, 11 Mar 2026 11:55:56 GMT</pubDate></item><item><title><![CDATA[Markov Decision Process]]></title><description><![CDATA[
A set of states S, a set of actions A, a transition function T (the model, the dynamics). Markov: given the present state, the future and the past are independent. There is a set of states to move among, and a set of actions to take to move between states. Taking action a in a specific state s might lead to different states s', with probability T(s, a, s'). Transitioning between states by taking an action is assigned a reward R(s, a, s'). A policy is a mapping from states to actions: to follow it, one should know what action to take in each state. With a policy, given a starting point, there is a deterministic state sequence. To evaluate the policy, we use the concept of utility. To focus more on the near future, use a discount to discount future rewards. To find a policy, the key is to average the discounted utilities with their probabilities. This is called the Bellman Equation. To calculate it, we need to search the whole tree. Policy Iteration:
Initialize Policy
Repeat until the policy stops changing: for every state, do policy evaluation: given a policy, follow it and calculate the probability-averaged utility (the Bellman Equation without the max operator): update the state values according to the current policy.
Policy extraction: once you have the values of all states, extract the policy from the converged values above with a one-step lookahead over actions; return the converged policy
Value Iteration:
Initialize Values
Repeat until the values stop changing (converge): for every state, look at the utilities of all available actions and take the biggest. Extract the policy from the converged values.
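The Value Iteration steps above, as a sketch on a hypothetical two-state MDP (the transitions and rewards are made up for illustration):

```python
# Sketch of Value Iteration and policy extraction on a toy MDP.
# P[s][a] = list of (next_state, probability); R[s][a] = reward. Assumed values.
P = {0: {"stay": [(0, 1.0)], "go": [(1, 0.8), (0, 0.2)]},
     1: {"stay": [(1, 1.0)], "go": [(0, 1.0)]}}
R = {0: {"stay": 0.0, "go": 1.0}, 1: {"stay": 2.0, "go": 0.0}}
gamma = 0.9  # discount factor

def value_iteration(tol=1e-8):
    V = {s: 0.0 for s in P}
    while True:
        # For every state, take the biggest one-step utility over actions.
        V_new = {s: max(R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a])
                        for a in P[s])
                 for s in P}
        if max(abs(V_new[s] - V[s]) for s in P) < tol:  # converged
            return V_new
        V = V_new

def extract_policy(V):
    """One-step lookahead over actions on the converged values."""
    return {s: max(P[s], key=lambda a: R[s][a] +
                   gamma * sum(p * V[s2] for s2, p in P[s][a]))
            for s in P}

V = value_iteration()
print(extract_policy(V))  # {0: 'go', 1: 'stay'}
```

State 1 pays 2 per step forever, so its value converges to 2/(1-0.9) = 20, and the extracted policy steers state 0 toward it.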
The probability function means that taking action a in state s leads to different states s' with probability T(s, a, s'). Stationary preferences. Discounted utility. Infinite utilities. Solutions: Deterministic policy: Stochastic policy: An optimal policy: maximizes the expected total discounted reward. Policy Extraction: Q value
Q(s, a): the value of taking action a in state s. Value
V(s): the value of the state. Finding the policy: the optimal policy says which action to take in every possible state to maximize the cumulative reward over time. Future rewards are less certain than immediate ones, so use a discount factor to prioritize sooner gains. The Bellman Equation relates the values of states. Racing Search Tree. Time-limited values: the optimal value of a state if the game ends in a limited number of more steps; this yields the Value Iteration algorithm. Expectimax
Policy evaluation: calculate utilities for some fixed policy until they converge
Policy improvement: update the policy using a one-step lookahead with the resulting converged utilities as future values
repeat until convergence
value iteration, policy iteration
policy evaluation
policy extraction (one-step lookahead). All are variations of Bellman updates]]></description><link>masters/semb/cs5491-ai/markov-decision-process.html</link><guid isPermaLink="false">masters/SemB/CS5491-AI/Markov Decision Process.md</guid><pubDate>Tue, 10 Mar 2026 23:44:47 GMT</pubDate></item><item><title><![CDATA[NAS]]></title><description><![CDATA[Leave the hard drive test unchecked: it is for HDDs, and running it wastes an HDD's life. Use Btrfs]]></description><link>living/electronics/nas.html</link><guid isPermaLink="false">living/Electronics/NAS.md</guid><pubDate>Mon, 09 Mar 2026 16:47:34 GMT</pubDate></item><item><title><![CDATA[Trading Program]]></title><description><![CDATA[I want to chat with an AI while it edits my programs for me. It must be able to read my entire code repository and remember my requirements, so that it makes the correct modifications. My working environment is VSCode, the programming language is Python, and the available data interface is Tushare Pro. Current hardware: a MacBook Air M4 24G+512G. Current AI subscriptions: Kimi 199 tier, Copilot, Deepseek with an 80 balance, Qwen with an 80 balance. My current goal is to first build a stable closed loop of trading-model development, backtesting, and live verification, and to make sure the trading models never use look-ahead (future) functions. Model development can be done together with the AI, i.e. the AI and I jointly produce the trading model's logic and code, including documentation. Then both the AI and I can learn from the development and backtest data of previously developed trading models.]]></description><link>财富/交易/交易程序.html</link><guid isPermaLink="false">财富/交易/交易程序.md</guid><pubDate>Mon, 09 Mar 2026 07:08:01 GMT</pubDate></item><item><title><![CDATA[Note Syncing]]></title><description><![CDATA[I started my Obsidian vault, which is synced with iCloud among my devices, on my PC yesterday, and when I opened the vault on my MacBook, the files had disappeared! I checked my PC; shit, they were gone there too! I finally found part of the notes (mostly recent ones) at the root of iCloud Drive, and it took hours to recover the vault. Goal: sync Obsidian notes among Apple devices (Mac, iPhone, iPad) and a Windows PC, so I can edit and read the latest notes on any of these devices. A backup method that avoids iCloud is needed. iCloud:
Works fine on Apple devices
Not stable on Windows; may cause file loss and misplaced files on iCloud, which leads to unacceptable loss with high probability.
Ugreen NAS Syncing:
Not available on Mobile devices (iPhone, iPad)
Using Obsidian Git Extension + Github Repository:
Restricted to GitHub: only GitHub is supported, and a GitHub repository should stay under the 1 GB storage limit to operate properly.
Security issue: private data is exposed to GitHub and potential hackers. Conflicts are not easy to resolve: requires git knowledge
Not realtime
Obsidian sync plan:
expensive: $4-$8 per month, billed yearly
]]></description><link>living/note-syncing.html</link><guid isPermaLink="false">living/Note Syncing.md</guid><pubDate>Sun, 08 Mar 2026 21:57:35 GMT</pubDate></item><item><title><![CDATA[README]]></title><description><![CDATA[this is my note vault!For obsidian]]></description><link>readme.html</link><guid isPermaLink="false">README.md</guid><pubDate>Sun, 08 Mar 2026 17:27:07 GMT</pubDate></item><item><title><![CDATA[早餐]]></title><description><![CDATA[温泉蛋]]></description><link>living/烹饪/早餐.html</link><guid isPermaLink="false">living/烹饪/早餐.canvas</guid><pubDate>Thu, 05 Mar 2026 20:55:43 GMT</pubDate></item><item><title><![CDATA[Probability]]></title><description><![CDATA[Note
The mathematical definition is more of a working tool; the intuitive definition is usually a corollary of this looser definition
Probability is a set function with a prescribed list of properties, convenient for doing algebra with. Definition: countable additivity: the definition is the minimal set of rules, from which more rules that we understand naturally can be derived:
: : : In essence, probability says: with this function I can build an algebraic system that makes quantitative computation about the likelihood of events convenient. How likelihood is actually modeled is irrelevant to these definitions and operations, which is why the frequentist and Bayesian viewpoints arose. Conditional probability satisfies the definition of probability, because: so every identity that applies to probability also applies to conditional probability. Multiplication rule: conditional probability of a conditional probability: fold the new condition into the existing conditions: , because
Treating as , we have. Law of total probability: given a set of hypotheses that partition A, i.e. whose union is A and which are mutually exclusive, we have: Bayes' theorem. A: lung cancer, B: smoking. We now observe that many lung cancer patients smoke.
If all lung cancer patients smoke and everyone smokes, does smoking necessarily cause lung cancer?
If all lung cancer patients smoke but not everyone smokes, then the lung-cancer rate given smoking is amplified
If the proportion of smokers among lung cancer patients is no higher than among the general population, then smokers' lung-cancer rate is no higher than ordinary people's
Disease and positive tests, for example:
The disease (A) rate is , the positive-given-disease (TPR) rate is , and the negative-given-no-disease (TNR) rate is . Then by the law of total probability the positive rate is , but the disease rate given a positive is only , so high TPR and TNR together do not mean a high PPV. When the samples are fairly balanced, high TPR and TNR do imply high PPV and NPV, but when the positive/negative distribution is extremely imbalanced: if TPR/FPR is not large enough, slightly scaling up the number of negatives will shrink the PPV. So the disease rate given a positive , while the no-disease rate given a negative is amplified , and the disease rate drops to . The FNR is below , but on a recheck the test's FNR and TPR do not change while the population becomes those who have already tested positive once, therefore
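The base-rate effect described above, as a small numeric sketch (the prevalence and test rates are illustrative assumptions):

```python
# Sketch: high TPR and TNR do not guarantee a high PPV when the
# disease is rare. Prevalence and rates below are made-up values.

def ppv(prevalence, tpr, tnr):
    """P(disease | positive) via Bayes' theorem / total probability."""
    p_pos = prevalence * tpr + (1 - prevalence) * (1 - tnr)  # total positive rate
    return prevalence * tpr / p_pos

print(round(ppv(prevalence=0.001, tpr=0.99, tnr=0.99), 2))  # 0.09: rare disease
print(round(ppv(prevalence=0.5,   tpr=0.99, tnr=0.99), 2))  # 0.99: balanced classes
```

Scaling up the negative class inflates the false-positive term in the denominator, which is exactly why the PPV collapses despite TPR = TNR = 0.99.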
P, N: the positive and negative sides of an example
T, F: correct and incorrect judgments. TP, TN: counts of correctly judged positive and negative examples
FP, FN: judged as P/N but wrongly (F), actually N/P. P = TP + FN, N = TN + FP
Prediction accuracy within the true classes: TPR (sensitivity), TNR (specificity): TP/P (the proportion of correctly judged positives among all positives), TN/N
FPR (false positive rate), FNR (false negative rate): FP/N, FN/P. Accuracy within the predicted classes: PPV, NPV: TP/(TP+FP), TN/(TN+FN)
FDR (false discovery rate), FOR (false omission rate): FP/(TP+FP), FN/(TN+FN). Overall: accuracy. Note
True/false {T, F} and predicted {P, N}, divided by {positive predictions, negative predictions, positive samples, negative samples}
Note: true positives over positive samples is the true positive rate TPR, measuring the model's sensitivity; the wrongly predicted proportion is the false negative rate FNR, the proportion of Type II (β) errors of accepting a false hypothesis
True negatives over negative samples is the true negative rate TNR, showing the model's specificity; its complement FPR is the proportion of Type I (α) errors of rejecting a true hypothesis
True positives over positive predictions is the positive predictive value PPV; true negatives over negative predictions is the negative predictive value NPV
False positives over positive predictions is the false discovery rate; false negatives over negative predictions is the false omission rate. The concept of independence originates from conditional probability, but is not defined by it.
Formal definition:
Corollary: once independent, the complements are independent too, because:
so they are also independent. Every formula combining with the definition of conditional probability also holds for conditional probability with an added condition. Formal definition:
Intuitive definition: , the added condition has no effect on me
Due to the formal consistency (the conditional probability of a conditional probability is well defined, and conditional probability satisfies the probability axioms), all corollaries of ordinary independence also hold for conditional independence. Classical probability: find a sample space with equally likely outcomes for the problem, then represent the probability of an event by a proportion. This kind of probability also satisfies the basic definition of probability, so all the conclusions discussed above apply to it too. But not every problem can be modeled with a uniform sample space, especially when the sample space has infinitely many sample points, so we need a way to represent non-uniform sample spaces. Take height: the classical model can only count heads, and the results it produces are discrete, problem-specific, and imprecise, whereas a random variable plus a distribution model lets us add assumptions that describe the "true underlying distribution", and since everything now lives on the number line, a lot of existing algebra carries over. Also, because using events as the input of the probability function is clumsy for the tools of real functions, a mapping is introduced that sends sample points to the number line; a random variable serves as the input, a function assigns probabilities to these numbers, and the probability of the event represented by a chosen interval is computed by "summing" on the line: algebraic sums when the line is discrete, integrals when it is the real line. Events are in fact still sets, but they are now described by the random variable's range of values, and probabilities are computed with a "summation formula" (algebraic sums for the discrete case, integrals for the continuous case). The computed probabilities still satisfy the definition of probability, so everything discussed above still applies.]]></description><link>math/statistics/probability.html</link><guid isPermaLink="false">math/Statistics/Probability.md</guid><pubDate>Thu, 05 Mar 2026 20:40:08 GMT</pubDate></item><item><title><![CDATA[Grilled Sausage]]></title><description><![CDATA[State: taken out of the freezer fully frozen
Air fryer: 180 °C for 6 minutes]]></description><link>living/烹饪/烤肠.html</link><guid isPermaLink="false">living/烹饪/烤肠.md</guid><pubDate>Thu, 05 Mar 2026 16:18:19 GMT</pubDate></item><item><title><![CDATA[Correlation]]></title><link>math/statistics/correlation.html</link><guid isPermaLink="false">math/Statistics/Correlation.md</guid><pubDate>Thu, 05 Mar 2026 08:12:21 GMT</pubDate></item><item><title><![CDATA[Problem Set 5]]></title><link>masters/semb/cs5487-ml/problem-set-5.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Problem Set 5.md</guid><pubDate>Wed, 04 Mar 2026 19:13:15 GMT</pubDate></item><item><title><![CDATA[Problem Set 4]]></title><link>masters/semb/cs5487-ml/problem-set-4.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/Problem Set 4.md</guid><pubDate>Wed, 04 Mar 2026 19:13:07 GMT</pubDate></item><item><title><![CDATA[温泉蛋]]></title><link>living/烹饪/温泉蛋.html</link><guid isPermaLink="false">living/烹饪/温泉蛋.md</guid><pubDate>Wed, 04 Mar 2026 14:42:45 GMT</pubDate></item><item><title><![CDATA[交易]]></title><description><![CDATA[三步走：
从亏到不亏
从不亏到小赚
从小赚到大赚
买卖点：
布林线触底买入，布林线触顶卖出，
中线：下跌趋势中线卖出，上涨趋势中线买入，横盘中线不操作
补仓策略：若买入后没有回调中线，以该价格作为参考，每下跌10%且触及布林线下线买入一份
获利了结策略：若触及布林线上端后没有回调中线，一直持有买卖点：
布林线触底买入，布林线触顶卖出，
中线：下跌趋势中线卖出，上涨趋势中线买入，横盘中线不操作
补仓策略：
若买入后没有回调中线，以该价格作为参考，每下跌10%且触及布林线下线买入一份
亏损达30%后买双份
获利了结策略：
若触及布林线上端后没有回调中线，一直持有
异动应对：
成交量稳步爬升/突发巨量，立即跟一份筹码
若跌停/高位放量下跌，卖出所有筹码
长上影触布林线顶卖出，长下影触布林线底买入一份
分仓操作亏损幅度：10%低位横盘或上涨趋势满仓干，高位下跌趋势博弈反弹则半仓进
双启明星接绿柱：抄底博弈一根反弹——得手就走
突然大涨：后一日只接一根——等待磨穿成本再入场
不接盘：放量绿柱，连续绿柱，上涨转下跌的趋势
稳定的股票的下跌波段：超跌入场等回归均线，目标达8成卖出
上涨趋势：趋势早期陪半仓，等待肉垫形成，另外半仓博弈回调，趋势早期错过直接满仓博弈回调
博弈回调：只博弈一根，第二天无论输赢直接出超涨是卖出机会，超跌是买入机会。但都需根据历史股价序列来做判断。
要考虑近期被套的人的心理市场价格由多空力量决定，市场价格反应多空博弈。做多做空被杀都会产生畏惧心理从而退缩而消停几日，但价格变化又会打破力量平衡带来短期反转。趋势早期会向旧趋势恢复，市场在拉锯后确认趋势后会趋势会继续发展。但多头力量反转趋势很少一波流，通常会和追涨筹码追求第二波上涨。诱多被杀通常意味着上涨阻力减小，机会更大。若出现多次明显进攻，则机会进一步增加。在达到上一波筹码高峰时通常会提前回调以释放压力，回调幅度越大后续上涨越强。上涨趋势转平转弱通常会动摇持仓者信心，大跌即将到来。但若未有效回调，前期趋势仍会坚定持仓者和新入市者的信心，削弱做空者信心。两根中阳线诱多的可能性大，后续十字星更能确认诱多，但击穿被诱多的筹码后可以博弈一根回调。超涨放量可能是二次诱多，但也可能是真拉涨，但若在高位或上涨趋势后可以考虑提前减仓，若后续只是有小回调可以博弈继续上涨。日K投机策略
现象：国债逆回购利率暴跌
解释：信心低谷/恐慌
机会：做多银行等防御板块 现象：一年内第二次事件驱动，并且前期已有上升
解释：大量投机筹码出现，第一或第二个交易日将出现大量筹码集中获利了结
机会：第一个交易日冲高卖出+做空 现象：1分钟线出现大量买卖
解释：多头入场收集筹码/拉高出货，预示着股票趋势的改变
机会：分辨是收集筹码（放量后不涨/反跌）、抢筹（消息相关）还是拉高出货（吃一小口就跑）
股票交易规则集市场参与方
短线资金：风险偏好型，追求短期高收益和高资金周转率，为了高收益能承受大回撤
耐心资本：风险敏感型，追求稳定增长，在价值低点投资布局，不会为了高收益牺牲稳定性
散户热钱：缺乏投资知识和稳定的投资理念，属于跟风盘，容易受到市场情绪影响对市场价格造成波动
基金：基金的实控人是基金经理，他们的利益在于管理的资金规模，基金效益越好，基金规模越大，收入越高。基金经理的约束是基金的合同，有时由于题材限制和仓位限制无法做到止损
市场监管方 市场的最终利益相关方是国家，国家维护的利益是统治者和其代表的统治阶级的共同利益。
国家能对市场施加的重要影响包括： 改变流动性：通过货币政策、财政政策或行政命令控制市场上的流动性，推动或抑制资产价格上涨 国家的动机为： 促进经济发展
维护市场稳定，保护资产 短期策略是和市场短期情绪博弈，如果没有影响市场价的能力，本质上是赌博，是火中取栗，风险很高。此时关注的是短期市场上涨的动力，即需要猜测主力的意图并选择跟随。由于市场动力主要来源于庄或市场情绪，而主力可以随时撤退，市场情绪容易被黑天鹅事件左右，需要小心控制风险。由于市场价格的极大不确定性，短期策略在没有信息优势的情形下很难赚取到超额利润，因此见好就收、达到预期就应撤退。甚至错过卖出最高点后也应撤退，因为短期策略买入能赚本身就是概率不大的事件，为了高估的收益继续持有最后导致亏损则是得不偿失的行为。价值策略的逻辑是低价购入，高价卖出，判断高低的价格分界线是资产的价值决定的。资产的价值取决于它未来能创造的收益。影响资产实际价值的有：
资产价值是由其本身决定的
市场的关注也可能对其产生影响
市场参与方对资产价值的评估来源于：
资产的价值可以由过去盈利能力体现，但这不适用于新型资产
对资产未来盈利能力的预期
价格总是在价值周围波动，无论这个周期有多长价格低于价值的资产不一定是好资产，因为如果当前低估的价格可能反映的是未来下跌后的价值。因此需要投资的是具有稳定价值如垄断型资产、支柱产业，或是具有成长性的资产例如新兴产业。价值投资者越是能精准把握每一次波动的转折点，就越是能赚取超额收益。投资的一种方式是低位买入，然后让资产跟随价值上涨，我称为被动投资。第二种方式则是逢低买入，逢高卖出，赚取波动带来的超额收益。当然如果过早卖出，过高买入，很可能会导致收益率还不如一直持有。因此只有在价格大幅偏离其原本价值的时候出手，才能保证在未来回调不充分的时候还能够上车。
流动性改善：钱多了市场通胀了价格自然就起来了
危机/繁荣：市场情绪被调动，影响交易价格
成交量：恐慌性抛售、狂热抢筹根据价格的高低是不同的情形
价格拉升：价格的迅速拉升可能是有投资资本需要短期套利
股市策略]]></description><link>财富/交易/交易.html</link><guid isPermaLink="false">财富/交易/交易.canvas</guid><pubDate>Wed, 04 Mar 2026 14:30:39 GMT</pubDate></item><item><title><![CDATA[经济]]></title><description><![CDATA[中国人民银行
我国央行不以盈利为目的，根据法律，央行的亏空由财政补充，盈利全部上交中央财政。
唯一发钞行。
美联储：
FOMC是决策机构 联邦储备委员会是由总统提名、参议院同意的7人委员会
加上5名轮换的联储银行行长构成联邦公开市场委员会FOMC，7+5的组合每六至八周共同商讨制定政策。 12家联邦储备银行是职能部门 负责执行政策、发行货币
为商业银行提供贴现贷款、清算支票 不以盈利为目的，盈利上缴财政部，亏损则延递到未来使用盈利填补。
香港金融管理局HKMA
发钞行：渣打、汇丰、中银
联系汇率：发钞行发钞时需要用等值美元购买外汇基金以“充值”港币
强、弱方保证：HKMA将会通过使用美元买卖港元来保证1美元能兑7.75-7.85港元。
央行的交易对手方是商业银行，商业银行通过放贷将流动性释放到社会中国债：金边债券
国债收益率是无风险利率。出借方以国债收益率为最低利率，因此决定了贷款人的资金成本。
7天逆回购：
固定利率，数量招标
国债逆回购是指贷款人（通常是金融机构）以国债为抵押物，出价向市场借钱。而央行国债逆回购则意味着印钞机向市场以低利率撒钱，将利率控制到一个较低的水平，控制贷款人成本，增加市场上的流动性。
央行国债逆回购：央行亲自下场参与国债抵押贷款市场债券购买
SLF：利率的上限，央行为商业银行提供的“紧急借钱”选项，因此所有成交价都不会高于这个利率
超额准备金利率：除强制存入央行的准备金外，商业银行额外存入央行能获得的利率，即央行提供的“借钱”利率，让市场的所有成交利率都高于这个利率。MLF：中期借贷便利，是央行为商业银行准备的数量固定，利率招标的贷款。一年期为主。
能够引导市场报价利率LPR的形成：LPR（贷款售价）是在MLF（商业银行成本）的基础上加点形成的。
MLF是你能拿到的利率的下限，是银行的成本价，而LPR则是你判断贵还是便宜的基准，是市场价。
央行]]></description><link>财富/经济/经济.html</link><guid isPermaLink="false">财富/经济/经济.canvas</guid><pubDate>Wed, 04 Mar 2026 14:28:58 GMT</pubDate></item><item><title><![CDATA[市场]]></title><description><![CDATA[Drag from below or double clickSpace + Drag to pan⌘ + Scroll to zoom]]></description><link>财富/市场/市场.html</link><guid isPermaLink="false">财富/市场/市场.canvas</guid><pubDate>Wed, 04 Mar 2026 14:27:58 GMT</pubDate></item><item><title><![CDATA[央行]]></title><description><![CDATA[中国人民银行
我国央行不以盈利为目的，根据法律，央行的亏空由财政补充，盈利全部上交中央财政。
唯一发钞行。
美联储：
FOMC是决策机构 联邦储备委员会是由总统提名、参议院同意的7人委员会
加上5名轮换的联储银行行长构成联邦公开市场委员会FOMC，7+5的组合每六至八周共同商讨制定政策。 12家联邦储备银行是职能部门 负责执行政策、发行货币
为商业银行提供贴现贷款、清算支票 不以盈利为目的，盈利上缴财政部，亏损则延递到未来使用盈利填补。
香港金融管理局HKMA
发钞行：渣打、汇丰、中银
联系汇率：发钞行发钞时需要用等值美元购买外汇基金以“充值”港币
强、弱方保证：HKMA将会通过使用美元买卖港元来保证1美元能兑7.75-7.85港元。
央行的交易对手方是商业银行，商业银行通过放贷将流动性释放到社会中国债：金边债券
国债收益率是无风险利率。出借方以国债收益率为最低利率，因此决定了贷款人的资金成本。
7天逆回购：
固定利率，数量招标
国债逆回购是指贷款人（通常是金融机构）以国债为抵押物，出价向市场借钱。而央行国债逆回购则意味着印钞机向市场以低利率撒钱，将利率控制到一个较低的水平，控制贷款人成本，增加市场上的流动性。
央行国债逆回购：央行亲自下场参与国债抵押贷款市场债券购买
SLF：利率的上限，央行为商业银行提供的“紧急借钱”选项，因此所有成交价都不会高于这个利率
超额准备金利率：除强制存入央行的准备金外，商业银行额外存入央行能获得的利率，即央行提供的“借钱”利率，让市场的所有成交利率都高于这个利率。MLF：中期借贷便利，是央行为商业银行准备的数量固定，利率招标的贷款。一年期为主。
能够引导市场报价利率LPR的形成：LPR（贷款售价）是在MLF（商业银行成本）的基础上加点形成的。
MLF是你能拿到的利率的下限，是银行的成本价，而LPR则是你判断贵还是便宜的基准，是市场价。
]]></description><link>财富/经济/央行.html</link><guid isPermaLink="false">财富/经济/央行.md</guid><pubDate>Wed, 04 Mar 2026 14:26:30 GMT</pubDate></item><item><title><![CDATA[汇]]></title><description><![CDATA[结汇，汇款
货币，汇，钞
银行存款：将现钞存入银行帐户，银行在帐户上增加数字，将现钞存入银行仓库。购汇：银行用本币在市场上采购外汇，客户用本币从银行购买外汇，这些外汇能转移到自己或他人的境外帐户。结汇：将帐户上的外汇卖给商业银行换取人民币，央行通过使用人民币购买外汇来向商业银行投放人民币。跨境汇款(payment)：当持有银行汇款时，可以通过银行，经由SWIFT系统，把款汇到境外帐户上。汇款方式为电汇，从而在银行之间完成信息的传递，而后续的实际清算则在银行之间进行。外汇管制：在向外汇款时，需要备案汇款的用途，大额跨境汇款甚至需要申报以获得批准。离岸人民币：跨境支付通使用5w美元的便利额度，可以将在岸人民币转到香港的帐户上变成离岸人民币。SWIFT：Society of Worldwide Interbank Financial Telecommunications]]></description><link>财富/经济/汇.html</link><guid isPermaLink="false">财富/经济/汇.md</guid><pubDate>Wed, 04 Mar 2026 14:26:25 GMT</pubDate></item><item><title><![CDATA[股市规则集]]></title><description><![CDATA[
Four trading hours per day, two in the morning and two in the afternoon: 9:30-11:30, 13:00-15:00
The morning has a 15-minute pre-open call auction in three phases: the advertising period (orders can be cancelled), the silent period (no cancellation), and the queueing period (orders can be submitted and cancelled, but none are processed)
The afternoon has a 5-minute closing call auction that only accepts buy/sell orders; no cancellation
During the call auction phases, the exchange collects orders over a period of time and then processes them all at one point in time, by price and order of arrival. In the meantime it publishes simulated matching results: virtual matched volume, unmatched volume, and virtual matched price.]]></description><link>财富/市场/股市规则集.html</link><guid isPermaLink="false">财富/市场/股市规则集.md</guid><pubDate>Wed, 04 Mar 2026 14:25:30 GMT</pubDate></item><item><title><![CDATA[Hongqiao Airport]]></title><description><![CDATA[2025-12-27. Terminal 2: check-in counters in a long row, each airline with its own dedicated counters; China Southern is at the far right
Security check: use the one-bag lane; take a sip of any water you carry; power banks are checked for 3C certification
Scan your boarding pass at the cabin door (the Umetrip electronic boarding pass works)
Takeoff:
12:33 engines started, pulled away from the jet bridge
12:45 thrust up, announced on the PA
12:47 takeoff roll started
12:48 liftoff
12:49 gear up; altitude 1100 m, ground speed 470 km/h
]]></description><link>living/交通/虹桥机场.html</link><guid isPermaLink="false">living/交通/虹桥机场.md</guid><pubDate>Wed, 04 Mar 2026 14:24:35 GMT</pubDate></item><item><title><![CDATA[宝安机场]]></title><description><![CDATA[2025-12-23一个小时抵达华强南站11号线和7号线是分离的，需要走很多路
红岭南已规划11号线新站点，届时9号线可转11号线
11:10分 福星-岗厦北地铁噪音很大（不知道是风噪还是轮噪）
岗厦北上了很多人 宝安机场 值机托运行李 自助托运 最好是侧边有提手的箱子，不然很难扫到码。宝安机场餐厅，吆喝多的别去，上菜没人管<img alt="image.jpg" src="living/交通/attachments/image.jpg" target="_self">进值机厅一个安检（地铁到达免检），进候机厅需要刷身份证，如果预约了易安检可以快很多。随后有一个安检（严格），不能带水，带电池的设备都要拿出来，放篮子里，外套要脱下来。5开头的登机口在卫星厅，卫星厅要在安检后才能坐捷运到达。卫星厅也很多吃的喝的和购物的，不用着急在T3解决。14:12 开始登机十二分钟后就提醒即将截止登机
14:25 提醒登机已结束
14:36 开始牵引
14:38 到跑道准备起飞
14:57 等待了多架飞机降落、起飞，跑道清空后，起飞]]></description><link>living/交通/宝安机场.html</link><guid isPermaLink="false">living/交通/宝安机场.md</guid><pubDate>Wed, 04 Mar 2026 14:24:05 GMT</pubDate><enclosure url="living/交通/attachments/image.jpg" length="0" type="image/jpeg"/><content:encoded>&lt;figure&gt;&lt;img src="living/交通/attachments/image.jpg"&gt;&lt;/figure&gt;</content:encoded></item><item><title><![CDATA[摄影]]></title><description><![CDATA[RAW：所有传感器信息
JPEG：有损压缩
TIFF：RAW信息+编辑信息相片格式傻瓜模式
auto档：所有参数相机决定如果需要调整白平衡，ISO，曝光补偿：
P档程序自动，相机自动控制进光量（光圈和快门）
A档光圈优先，由用户决定光圈，相机测光决定快门速度
S档快门优先，由用户决定快门，相机测光决定光圈
M档全手动
光圈：大光圈浅景深
快门：低快门易糊
白平衡：温度低中高对应红色白色蓝色。温度是白色点平衡点，让橙色发白降低温度，让蓝色发白升高温度。相机不同档位成像原理光圈的表达式为例如。这描述的是光圈的绝对直径大小，是焦距。
F-number 可以理解为焦距和光圈直径的比值，即打到中心焦点的光线的角的正弦值的倒数 可以理解为光圈相对于焦距的比例。
光圈和光圈的进光量比值为。因此比要多进倍光我们说什么「大光圈」「小光圈」实际上说的不是光圈的直径，而是焦距和光圈直径的比值「F-number」。大光圈意味着小F值，意味着折射进镜头的光角会变大，相同距离产生的弥散圆会变大。光圈从小调大后，原本小于传感器像素的弥散圆现在会大过传感器像素的大小，覆盖很多像素，成为光晕；而原本小的光晕则变成更大的光晕。光圈ISO值衡量相机对光强的敏感程度。相机感光过程：
浮动扩散节点FD（Floating Diffusion）将电子转化为电压信号，源极跟随器SF的栅极接受来自FD的电容的电压信号并接力给后续的电路
模拟放大电路PGA（Programmable Gain Amplifier）按ISO/基础ISO（通常是100）的比例放大读出的电压信号，交给模数转换器ADC（Analog Digital Convertor）。
ADC将强度相近的模拟信号归到一个水平的数字信号中
因此相机提升感光度的方式有：
增加电子-电压增益
提高PGA放大倍率
放大数字信号
由于相机提升感光度的策略一般基于ISO的数值范围，因此也产生了几个概念：
基准ISO（Base ISO）：相机传感器未经放大下的感光度。
原生ISO：经模拟电路放大的叫原生ISO，放大的是模拟信号，约在100-12800
扩展ISO：通过放大数字信号来提亮画面，机内提亮和后期提亮是同一种操作，没有区别。
不同相机的感光度提升机制不同，因此ISO的具体意义是由语境定义的：一般的数字相机：
增加感光度就是在放大信号，低感光度放大模拟信号，在模拟信号放大倍率到上限的时候放大数字信号。双转换增益Dual Conversion Gain相机：
这种相机有两个电容，也就有两个基准ISO，在光线充足时使用低增益电子-电压转换，能够容纳更广的光强范围；而在低光环境下，使用高增益电子-电压转换，从而能够避免使用PGA模拟放大，从而避免所有前端噪声都被放大。选择更高的ISO是在命令相机提高感光度，对DCG相机而言，ISO提升到超过一个阈值就会选择使用高增益电路放大电子信号来替代PGA的模拟信号放大。ADC接受的电压是有上限的，接受到的电压超过这个上限就为过曝。因此ISO放大倍率过大动态范围是一个信号概念，是最大信号值/最小信号值的对数，用分贝表示。在相机中，就是满阱容电子数量/读出噪声电子数量的二对数，曝光指数EV（Exposure Value）则由读出电子数/读出噪声电子数的二对数计算得到。提高ISO：底噪被拉升，ADC接收到的有效电压范围缩小，即最大电压不变，最小电压提高，因此动态范围缩小。进光量：传感器实际接收到的光量，由快门和光圈决定。高快门不易糊，小光圈景深深。ISO的作用：
让ADC能够更好地解析欠曝照片。由于ADC的分辨率是有上限的，如果进光量不足，导致传感器模拟部份读取的图像动态范围不足，会导致ADC无法很好的分辨电压变化，导致不同的模拟信号被解析成相同的水平的数字信号。因此提高ISO能让欠曝的相片被ADC更好地解析。
对比后期数字放大：虽然提高ISO也会放大噪声信号，但是相比后期数字放大，由于ADC能够更好解析。图像的细节更丰富。而且不会有ADC噪声被放大的问题。
如果进光亮够大，那么直接使用原生ISO，如果太大，在原生ISO下都过曝，则需要减小进光亮。
阻抗：电路对交流电的总阻碍，包含电阻和电抗两个部份，电抗又包含容抗和感抗两个部份，容抗来源于电容对变化电压的阻碍作用，感抗则来源于因电流变化产生的自感电动势。增益：输入与输出的比值动态范围：信号最大值与噪声最大值的比值的10对数，单位为分贝dB曝光指数EV：EV用来指示相机的进光量，EV0等效ISO 100，光圈f/1.0下的进光量。EV补偿则是用来调整相机的自动进光量决策。大光比ISOFocal Length焦距被定义为透镜中心到光线聚焦点的距离。易混概念：「调整焦距」和「对焦」是两种不同的操作。「对焦」是调整镜头的透镜组，从而调整整体折射率，使得物像能够清晰的呈现在传感器上。
放大：在透镜大小不变的情况下，焦距越长，能通过透镜折射到相平面的光线角度就越小，因此拍出的相片大
放大背景：拉开与主体的距离，使得主体覆盖的视场角度和近距离短焦的视场角，对于相同主体，由于背景的视角变小，因此背景整体看起来像是被放大了。
浅景深：由于高f值意味着低屈光能力，因此与对焦物相距相同距离的物，在传感器上的弥散圆在高f下会比较大，由此景深也会比较浅。
考虑对焦无限远的物体，即收集平行光线到焦平面上。相机的允许弥散圆直径是固定的，因此对比起广角镜头，长焦对焦无限远的物体时，弥散圆的透镜成像公式：即焦距倒数为像距倒数和物距倒数和。“Pasted image 20260122051219.png” could not be found.相机的对焦是在焦距不变的条件下，根据物距调整像距，让感光器落在要成像的物体的像平面上，而变焦则是改变透镜组的焦距人眼的对焦是在像距不变的条件下，根据物距调整焦距。焦距随着物距增大而增大，，因此看向近处的物体时，焦距变小，此时可以发现远处的物体也开始变得紧凑，这正是焦距变小导致视场变大的结果。真性近视是因为像距变长，远距离的物距需要的焦距也变长，超过了晶状体最放松状态下的焦距，因此需要凹透镜来“帮忙延长”焦距；假性近视则是因为晶状体紧张，无法放松到更大的焦距。老花则是因为晶状体弹性变弱，焦距缩短变难，因此需要凸透镜来帮忙屈光。简单来说，真性近视是：焦距调节范围与需要范围错位；假性近视和老花是：焦距调节范围变小。焦距机械卷帘快门：
前帘落下，CMOS清空电荷，前帘打开，CMOS开始曝光，后帘落下， CMOS结束曝光，CMOS数据读取完成，后帘打开复位。电子快门：逐行曝光，延迟一段时间后逐行读取，延迟的时间是曝光时间，读取速度决定快门速度电子前帘快门
原理：电子前帘开启曝光，机械后帘遮光后读取，能够分离曝光和读取阶段，利用机械后帘的速度优势，减小果冻效应。
劣势：两个快门不在同一平面， 后帘在焦平面前，而前帘在焦平面，因此在快门间隔很小的时候，光斑会被裁切 焦外弥散圆：由于前后帘不在同一平面，前帘在会遮挡底部的前景弥散圆高快门的效果：
进光量低，能够在大光圈下抑制高光
模糊少
快门和包围曝光的区别：HDR是图像亮部暗部拼接，包围曝光是叠加HDR浅景深就是要在像平面上产生足够大的弥散圆
大光圈（高F值）：高F值意味着进光角度变大，弥散圆更大
长焦：长焦屈光能力弱，弥散圆更大
近物距：物距越近，要达到大小的弥散圆需要的物距差就越小，因此景深浅。
使用浅景深：大光圈搭配短快门保证进光量，一般也是更加聚焦，近距长焦就适合，太大的话远距长焦也可以。
使用深景深：小光圈搭配长快门保证进光量，一般也是要拿更多信息，远距广角就适合景深di ku运动模糊]]></description><link>摄影/摄影.html</link><guid isPermaLink="false">摄影/摄影.canvas</guid><pubDate>Wed, 04 Mar 2026 14:22:45 GMT</pubDate></item><item><title><![CDATA[FunSearch]]></title><description><![CDATA[Capacitated Vehicle Routing ProblemFunSearch: Searching for functions written in computer code.Idea: a pre-trained LLM (creativity) + an automated evaluator (guard against incorrect ideas)Problem description: a procedure to evaluate the programs
a seed program used to initialize a pool of programs
Self-Improving loop:
Pick some programs from the pool
Generate and evaluate
The best ones are added back to the pool
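The self-improving loop above, as a stripped-down sketch in which a random mutator stands in for the LLM; the toy target constant and the scoring function are pure assumptions:

```python
# Sketch of the FunSearch-style loop: pick from the pool, mutate
# ("generate"), score with an automated evaluator, keep the best.
import random

random.seed(42)
TARGET = 3.14159  # toy "problem": evolve a constant toward this value

def evaluate(program):
    """Automated evaluator: higher score = closer to the target."""
    return -abs(program - TARGET)

def llm_mutate(program):
    """Stand-in for the LLM proposing a modified program."""
    return program + random.uniform(-0.5, 0.5)

pool = [0.0]  # seed "program" initializing the pool
for _ in range(200):
    parent = random.choice(pool)   # pick some programs from the pool
    child = llm_mutate(parent)     # generate
    pool.append(child)             # evaluate and add back
    pool = sorted(pool, key=evaluate, reverse=True)[:10]  # keep the best

best = pool[0]
print(best)  # drifts toward TARGET over the iterations
```

In the real system each "program" is source code, the evaluator executes it on problem instances, and the pool is sharded into islands to preserve diversity.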
Originally Google's PaLM 2
avoid stagnation: improve diversity of ideas
efficiency: run the evolutionary process in parallel
show-your-working approach: favor concise and human-interpretable programs. Human-AI collaboration. FunBO
best-shot prompting: sample the best-performing programs
skeleton: only evolve the part governing the critical program logic
an island-based evolutionary method: maintain a large pool of diverse programs and avoid local optima.
scaling it asynchronously.
using fast-inference model: on the order of combinatorial competitive programming
competitive programming]]></description><link>masters/semb/cs5491-ai/project-cvrp/funsearch.html</link><guid isPermaLink="false">masters/SemB/CS5491-AI/Project-CVRP/FunSearch.md</guid><pubDate>Tue, 03 Mar 2026 07:58:55 GMT</pubDate></item><item><title><![CDATA[分仓操作]]></title><description><![CDATA[三步走：
从亏到不亏
从不亏到小赚
从小赚到大赚
买卖点：
布林线触底买入，布林线触顶卖出，
中线：下跌趋势中线卖出，上涨趋势中线买入，横盘中线不操作
补仓策略：若买入后没有回调中线，以该价格作为参考，每下跌10%且触及布林线下线买入一份
获利了结策略：若触及布林线上端后没有回调中线，一直持有买卖点：
布林线触底买入，布林线触顶卖出，
中线：下跌趋势中线卖出，上涨趋势中线买入，横盘中线不操作
补仓策略：
若买入后没有回调中线，以该价格作为参考，每下跌10%且触及布林线下线买入一份
亏损达30%后买双份
获利了结策略：
若触及布林线上端后没有回调中线，一直持有
异动应对：
成交量稳步爬升/突发巨量，立即跟一份筹码
若跌停/高位放量下跌，卖出所有筹码
长上影触布林线顶卖出，长下影触布林线底买入一份
]]></description><link>财富/交易/分仓操作.html</link><guid isPermaLink="false">财富/交易/分仓操作.md</guid><pubDate>Tue, 03 Mar 2026 07:26:22 GMT</pubDate></item><item><title><![CDATA[2025-11-24 爆竹投机心得]]></title><description><![CDATA[条件：
频率：每日最多交易一次
价格：以收盘价交易
资金量：1w起
手续费：万三不免五
算一笔账：万三不免五下，1万元每次买卖来回损耗大约16元，也就是大约+0.16%的最低涨幅为成本线。五万元内可以使用+1%的成本线。
成本劣势：追涨建仓意味着如果趋势改变，你的成本比前后进场的筹码都要更高，因此承担了更大的风险、收益也更低。
追涨时机：追涨建仓风险极高，因此需要匹配极大的收益，如低位爆量的时候。而且由于成本劣势，需要及时卖出。
正确做法：买在跌时，卖在阳时或缩量的时候。 盈亏比：要是先亏x%，那就要赚，要是后亏x%，那就得先赚，盈亏比为，按这个公式，亏10%要赚回11.11%，亏20%要赚回25%，亏33%要赚50%，亏50%要赚100%
规避亏比冒险赚更重要：宁愿少赚50个点，也不亏30个点。
经历多次调整的上涨阶段的破周期大跌：可能是双周期中大周期底部，但更可能是单周期趋势的结束。 警惕庄家诱多：庄家通常通过诱多的手法，在赚取少量差价的同时教育市场，从而在最后拉升的时候削弱散户进场的可能性。
连续两根稳定/放量的小阳线：下跌趋势在阳线上建仓就是找死，跌是市场共识，涨是大资金行为。一般来说，此时资金诱多杀空的可能性比直接拉涨更大，风险收益比极高，当然具体情况具体分析。
三连阳、明显的超跌可以尝试半仓：设置5%的止损，赢钱赚一点，亏钱亏不多。如若有人可以制造上涨趋势，就等待大跌陪一个全仓，第二天若跌减半仓，再跌全撤认栽。 暴跌风险：上涨可以曲折发展，但跌完可以是一波跌完。
机会成本：涨到高处时卖出股票，短期连续上涨的机会成本已经较低了，但是下跌后补仓的机会更大。 高近期成本：割肉意味着这次判断错了，而且已经累积了比较高的近期成本，相当于进入后期的一个可能的套牢盘，如果再参与后续的诱多点，相当于还没赚就先承担吃两次下跌的风险，而且该风险已经兑现了一半。
正确做法：收紧进场条件 普通人的精力很难做到
正常人的时间不允许
对理性交易有损 对于不该跌的股票，若判断为洗盘，可以选择跌卖一半，再跌买回来，早期洗盘中间可以暂停一两天等更低的价格，一直执行这个循环直到价格稳住，然后拿住不动。但永远有筹码在手里。
市场心理
上涨趋势的第一次大跌，引发恐慌性洗盘，加仓机会，视情况分仓或全仓进场
市场主力发动一次反弹大概率会有洗盘但难是诱多，主力至少还要再诱多一次。
快速触底通常意味着底部将要被破，但刚破可以博弈一根，但要警惕连续破位。
]]></description><link>财富/交易/2025-11-24-爆竹投机心得.html</link><guid isPermaLink="false">财富/交易/2025-11-24 爆竹投机心得.md</guid><pubDate>Tue, 03 Mar 2026 06:23:06 GMT</pubDate></item><item><title><![CDATA[股市策略]]></title><description><![CDATA[市场参与方
短线资金：风险偏好型，追求短期高收益和高资金周转率，为了高收益能承受大回撤
耐心资本：风险敏感型，追求稳定增长，在价值低点投资布局，不会为了高收益牺牲稳定性
散户热钱：缺乏投资知识和稳定的投资理念，属于跟风盘，容易受到市场情绪影响对市场价格造成波动
基金：基金的实控人是基金经理，他们的利益在于管理的资金规模，基金效益越好，基金规模越大，收入越高。基金经理的约束是基金的合同，有时由于题材限制和仓位限制无法做到止损
市场监管方 市场的最终利益相关方是国家，国家维护的利益是统治者和其代表的统治阶级的共同利益。
国家能对市场施加的重要影响包括： 改变流动性：通过货币政策、财政政策或行政命令控制市场上的流动性，推动或抑制资产价格上涨 国家的动机为： 促进经济发展
维护市场稳定，保护资产 短期策略是和市场短期情绪博弈，如果没有影响市场价的能力，本质上是赌博，是火中取栗，风险很高。此时关注的是短期市场上涨的动力，即需要猜测主力的意图并选择跟随。由于市场动力主要来源于庄或市场情绪，而主力可以随时撤退，市场情绪容易被黑天鹅事件左右，需要小心控制风险。由于市场价格的极大不确定性，短期策略在没有信息优势的情形下很难赚取到超额利润，因此见好就收、达到预期就应撤退。甚至错过卖出最高点后也应撤退，因为短期策略买入能赚本身就是概率不大的事件，为了高估的收益继续持有最后导致亏损则是得不偿失的行为。价值策略的逻辑是低价购入，高价卖出，判断高低的价格分界线是资产的价值决定的。资产的价值取决于它未来能创造的收益。影响资产实际价值的有：
资产价值是由其本身决定的
市场的关注也可能对其产生影响
市场参与方对资产价值的评估来源于：
资产的价值可以由过去盈利能力体现，但这不适用于新型资产
对资产未来盈利能力的预期
价格总是在价值周围波动，无论这个周期有多长价格低于价值的资产不一定是好资产，因为如果当前低估的价格可能反映的是未来下跌后的价值。因此需要投资的是具有稳定价值如垄断型资产、支柱产业，或是具有成长性的资产例如新兴产业。价值投资者越是能精准把握每一次波动的转折点，就越是能赚取超额收益。投资的一种方式是低位买入，然后让资产跟随价值上涨，我称为被动投资。第二种方式则是逢低买入，逢高卖出，赚取波动带来的超额收益。当然如果过早卖出，过高买入，很可能会导致收益率还不如一直持有。因此只有在价格大幅偏离其原本价值的时候出手，才能保证在未来回调不充分的时候还能够上车。
流动性改善：钱多了市场通胀了价格自然就起来了
危机/繁荣：市场情绪被调动，影响交易价格
成交量：恐慌性抛售、狂热抢筹根据价格的高低是不同的情形
价格拉升：价格的迅速拉升可能是有投资资本需要短期套利
]]></description><link>财富/交易/股市策略.html</link><guid isPermaLink="false">财富/交易/股市策略.md</guid><pubDate>Tue, 03 Mar 2026 06:22:59 GMT</pubDate></item><item><title><![CDATA[股票交易规则集]]></title><description><![CDATA[
Phenomenon: the treasury reverse-repo rate crashes
Interpretation: a confidence trough / panic
Opportunity: go long on defensive sectors such as banks. Phenomenon: the second event-driven move within a year, with a prior run-up
Interpretation: a large amount of speculative money has appeared; heavy concentrated profit-taking will show up on the first or second trading day
Opportunity: sell into the spike on the first trading day + short. Phenomenon: heavy buying and selling on the 1-minute chart
Interpretation: bulls entering to accumulate chips, or pumping to distribute, signaling a change in the stock's trend
Opportunity: distinguish accumulation (no rise or even a fall after heavy volume), scrambling for chips (news-driven), and pump-and-dump (taking a small bite and running)
]]></description><link>财富/市场/股市/股票交易规则集.html</link><guid isPermaLink="false">财富/市场/股市/股票交易规则集.md</guid><pubDate>Tue, 03 Mar 2026 05:56:09 GMT</pubDate></item><item><title><![CDATA[日K投机策略]]></title><description><![CDATA[亏损幅度：10%低位横盘或上涨趋势满仓干，高位下跌趋势博弈反弹则半仓进
双启明星接绿柱：抄底博弈一根反弹——得手就走
突然大涨：后一日只接一根——等待磨穿成本再入场
不接盘：放量绿柱，连续绿柱，上涨转下跌的趋势
稳定的股票的下跌波段：超跌入场等回归均线，目标达8成卖出
上涨趋势：趋势早期陪半仓，等待肉垫形成，另外半仓博弈回调，趋势早期错过直接满仓博弈回调
博弈回调：只博弈一根，第二天无论输赢直接出超涨是卖出机会，超跌是买入机会。但都需根据历史股价序列来做判断。
要考虑近期被套的人的心理市场价格由多空力量决定，市场价格反应多空博弈。做多做空被杀都会产生畏惧心理从而退缩而消停几日，但价格变化又会打破力量平衡带来短期反转。趋势早期会向旧趋势恢复，市场在拉锯后确认趋势后会趋势会继续发展。但多头力量反转趋势很少一波流，通常会和追涨筹码追求第二波上涨。诱多被杀通常意味着上涨阻力减小，机会更大。若出现多次明显进攻，则机会进一步增加。在达到上一波筹码高峰时通常会提前回调以释放压力，回调幅度越大后续上涨越强。上涨趋势转平转弱通常会动摇持仓者信心，大跌即将到来。但若未有效回调，前期趋势仍会坚定持仓者和新入市者的信心，削弱做空者信心。两根中阳线诱多的可能性大，后续十字星更能确认诱多，但击穿被诱多的筹码后可以博弈一根回调。超涨放量可能是二次诱多，但也可能是真拉涨，但若在高位或上涨趋势后可以考虑提前减仓，若后续只是有小回调可以博弈继续上涨。]]></description><link>财富/交易/日k投机策略.html</link><guid isPermaLink="false">财富/交易/日K投机策略.md</guid><pubDate>Tue, 03 Mar 2026 05:28:49 GMT</pubDate></item><item><title><![CDATA[Search Problems]]></title><description><![CDATA[State Space Graphs
Nodes: abstracted world configurations
Arcs: Successors
Goal test: goal nodes
Traveling in Romania
fringe of partial plans
expand out potential plans (tree nodes, as few as possible). Fringe in tree search: the collection of all nodes that have been generated but not yet expanded. Initialize the tree with the start state.
while there are candidate nodes: goal test. True: return the solution
False: add to the tree and expand nodes; for expanded nodes, update the fringe. Each node has only one parent; the parent is updated if the cost is lower. Properties: Complete: guaranteed to find a solution if one exists. Optimal: guaranteed to find the least-cost path. Time complexity? Space complexity? is the branching factor, is the max depth, nodes. Depth First Search
Strategy: explore the deepest node first
Implementation: fringe is a LIFO stack Breadth First Search
Strategy: expand the shallowest node first
Implementation: fringe is a FIFO queue Run a DFS with an increasing depth limit
depth limit: actually it's BFS
space efficiency: use a stack costs Properties: waste: search with depth limit is re-executed as in search with limit , but the waste is manageable as the time complexity is determined by Uniform cost search UCS
Strategy: expand a cheapest node first
Implementation: fringe is a priority queue( priority: cumulative cost)
Properties: UCS processes all nodes with cost less than the cheapest solution. Summary:
DFS: space efficient, but might not be time efficient
BFS: costs a lot of space, but can find the shallowest solution as soon as possible.
UCS: greedily explore the cheapest solution. Explore increasing cost contours Search is simulated (not practice in real world)
Your search is only as good as your models
Uninformed search: DFS, BFS, UCS, search with your eye closed
Informed search: Greedy, A*, search with your eye openHeuristics: A function that estimates how close a state is to a goal.
Strategy: pick the nodes according to the heuristic function
Example: Romania. Heuristic function: straight-line distance to Bucharest. Does not work in many cases; a common case: best-first takes you to the wrong goal. Uniform-cost: backward cost. Greedy: forward cost. A* Search: Property: Actual goal cost &lt; Optimal
Inadmissible heuristics break optimality by trapping good plans on the fringe.
Admissible: optimistic (the heuristic never overestimates the actual goal cost), ensuring that the solution is optimal. Assume: A is an optimal goal node, B is a sub-optimal goal node, and the heuristic is admissible
Claim: Optimal node A will exit the fringe before sub-optimal node B
Admissible heuristics are solutions to relaxed problems. Looking up a map route, Shenzhen to Shanghai:
UCS: first find which transport hub from Shenzhen costs the least time, expanding ring by ring until reaching Shanghai
Greedy: first find which transport hub of Shenzhen is closest to Shanghai, then search node by node toward Shanghai
A*: find the hub minimizing (time from Shenzhen to the hub) + (distance from the hub to Shanghai), then gradually expand toward Shanghai. Updated before dequeuing: if the goal node is still on the fringe, update the goal node
Admissible：
Proof of Optimality of A* Tree Search is the priority, is the cost from start to here, and is the heuristic from here to goal.
goal nodes: a goal node is a node in the search tree, can be the same node in the graph
an optimal goal node is the node that has the smallest value.
a suboptimal goal node is the node that has admissible: for any node , , is decided, so should be choose.
any ancestor of will be expanded because because is admissible more specifically, because is optimal and is suboptimal because and Bin Packing Problem
Online Bin Packing: a search function with a well-designed heuristic
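A simple best-fit heuristic of the kind used to seed the search might look like this sketch (the item sizes and bin capacity are made up):

```python
# Sketch of online best-fit bin packing: place each arriving item
# into the feasible bin that would have the least leftover space.

def best_fit(items, capacity):
    bins = []  # remaining capacity of each open bin
    for item in items:
        # feasible bins, tightest fit (least leftover after packing) first
        candidates = [i for i, space in enumerate(bins) if space >= item]
        if candidates:
            i = min(candidates, key=lambda i: bins[i] - item)
            bins[i] -= item
        else:
            bins.append(capacity - item)  # no bin fits: open a new one
    return len(bins)

print(best_fit([4, 8, 1, 4, 2, 1], capacity=10))  # 2
```

FunSearch's contribution is then to let an LLM repeatedly rewrite the scoring rule inside such a heuristic while an evaluator keeps only the variants that pack better.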
FunSearch:
Write a simple best-fit heuristic as the first heuristic
loop: let LLM modify based on the current heuristic (expand)
evaluate these heuristics, add them to the database Uniform search:
pick node
goal check, if goal, stop
span
go to 1
Three different searching algorithms are just uniform search guided by different priorities:
Uniform cost search: only cost
Greedy: only heuristic
A* Search: cost + heuristic
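The three priorities above can share one generic implementation; a sketch on a hypothetical small graph:

```python
# Generic priority-guided search: UCS (cost only), Greedy (heuristic
# only), A* (cost + heuristic). Graph and heuristic values are made up.
import heapq

def best_first(graph, start, goal, heuristic, use_cost=True, use_h=False):
    fringe = [(0, 0, start, [start])]  # (priority, cost, node, path)
    seen = {}
    while fringe:
        _, cost, node, path = heapq.heappop(fringe)
        if node == goal:
            return cost, path
        if node in seen and seen[node] <= cost:
            continue  # already expanded more cheaply
        seen[node] = cost
        for nxt, step in graph.get(node, []):
            c = cost + step
            prio = (c if use_cost else 0) + (heuristic[nxt] if use_h else 0)
            heapq.heappush(fringe, (prio, c, nxt, path + [nxt]))
    return None

# Edges are (neighbor, step cost); h is an admissible straight-line guess.
graph = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 6)], "B": [("G", 1)]}
h = {"S": 4, "A": 3, "B": 1, "G": 0}

print(best_first(graph, "S", "G", h))                             # UCS: (4, ['S', 'A', 'B', 'G'])
print(best_first(graph, "S", "G", h, use_cost=False, use_h=True)) # Greedy: (5, ['S', 'B', 'G'])
print(best_first(graph, "S", "G", h, use_cost=True, use_h=True))  # A*: (4, ['S', 'A', 'B', 'G'])
```

Note how greedy reaches the goal via the hub that merely looks closest and pays cost 5, while UCS and A* both return the optimal cost-4 route.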
]]></description><link>masters/semb/cs5491-ai/search-problems.html</link><guid isPermaLink="false">masters/SemB/CS5491-AI/Search Problems.md</guid><pubDate>Mon, 02 Mar 2026 20:31:54 GMT</pubDate></item><item><title><![CDATA[Adversarial Search]]></title><description><![CDATA[Mini-Max: backward search: assume both players are perfectly rational and each knows the other is rational; work backwards from the outcomes to see how the two opposing players would choose at each step. Expand nodes in alternating turns until a terminal state, compare the current path's score with the existing score, and, according to the interest of the player acting at each level, decide whether to update the score and pass the path upward.
Set the initial value to the worst case
update the value to be returned by querying values from descendants, and reduce unnecessary search (pruning) by passing the feasible set of values to the children. Max node: prune if . No need to search more; the returned value is deemed to be discarded by its min parent.
min node: prune if . What minimax search is saying is: for each internal node, collect optimal results from its descendants by delegating this task to its children; then the node concludes from the results collected and returns to its parent. To optimize the search, we need to pass information down: these internal nodes can specify what kind of results are acceptable, to let their children know when to stop. In this minimax case, a max node can issue the "result collecting request" with this "requirement": only explore routes that would produce a score between and . We let the node issue this request to its children iteratively, so it can adjust its requirement parameters (assigned the current potential max score) and (which would not actually be adjusted). So what a max node is doing when calling its children is just like saying: "hey boy, I've asked your brothers and found that the biggest value is , and my parent says he wants a score smaller than , so go find it!". What's more, when the max node finds , it knows no more legal value would be found, so it simply stops and returns.]]></description><link>masters/semb/cs5491-ai/adversarial-search.html</link><guid isPermaLink="false">masters/SemB/CS5491-AI/Adversarial Search.md</guid><pubDate>Mon, 02 Mar 2026 20:13:49 GMT</pubDate></item><item><title><![CDATA[Midterm]]></title><description><![CDATA[Session: 12:00 pm - 1:50 pm LI 2505]]></description><link>masters/semb/cs5491-ai/midterm.html</link><guid isPermaLink="false">masters/SemB/CS5491-AI/Midterm.md</guid><pubDate>Mon, 02 Mar 2026 19:25:45 GMT</pubDate></item><item><title><![CDATA[Classification Problems]]></title><description><![CDATA[A good classification model should classify every class correctly (true rates), and among the predictions it makes, the proportion of correct predictions should also be high (predictive values). With predicted values as row indices and true values as column indices, build a Confusion Matrix whose entries count, after the model processes the sample set, the samples predicted as one class that actually belong to another. When a single entry is increased, since the total number of samples is fixed, other entries must decrease and the corresponding error entries increase, unless a way is found to improve both at once. From a probability viewpoint, the matrix can be treated as an approximation of the joint probability distribution:
Proportion of samples correctly predicted: Sensitivity, Specificity
Proportion of predictions that are correct: Positive Predictive Value, Negative Predictive Value
In development you usually obtain the first; in actual use you care about the second]]></description><link>分类问题.html</link><guid isPermaLink="false">分类问题.md</guid><pubDate>Mon, 02 Mar 2026 19:23:24 GMT</pubDate></item><item><title><![CDATA[Hypothesis Testing]]></title><description><![CDATA[State two hypotheses: a null hypothesis, under which the probability is computed, and an alternative hypothesis opposed to it, selected when the null is rejected. Set a significance level; relying on the drawn sample, when the computed probability (P value) falls below it, reject the null. Statistic vs. parameter: a parameter describes the abstract "population", while a statistic describes the obtainable "sample". Different statistics can be designed, and their distributions derived: the sampling distributions
The chi-square distribution is the distribution of the chi-square statistic
The t distribution is the distribution of the t statistic. Confidence interval]]></description><link>math/statistics/假设检验.html</link><guid isPermaLink="false">math/Statistics/假设检验.md</guid><pubDate>Mon, 02 Mar 2026 19:21:06 GMT</pubDate></item><item><title><![CDATA[Stochastic Process]]></title><description><![CDATA[a family of random variables
the index of the family often has the interpretation of time
Brownian motion: Bachelier, The Theory of Speculation
Poisson process: Erlang, the number of phone calls occurring in a period of time
]]></description><link>math/statistics/stochastic-process.html</link><guid isPermaLink="false">math/Statistics/Stochastic Process.md</guid><pubDate>Sun, 01 Mar 2026 07:46:31 GMT</pubDate></item><item><title><![CDATA[ML Lesson 3 多元高斯参数MLE]]></title><description><![CDATA[derivatives of vector二次型 多元高斯分布概率密度函数MLE of multi-variant Gaussian, given DNote Since is symmetric, is symmetric$$
A^T = A \Longrightarrow (A^{-1})^T = (A^T)^{-1} = A^{-1}
$$
$$
\begin{align}
\Sigma^* &amp;= \arg\max_{\Sigma}\; -\frac{n}{2}\log|\Sigma| - \frac{1}{2} \sum_i (x_i - \mu)^T \Sigma^{-1} (x_i -\mu)\\
&amp;= \arg\min_{\Sigma}\; n\log|\Sigma| + \sum_i \sum_j \sum_k (x_i-\mu)_{j}\,\Sigma^{-1}_{jk}\,(x_i-\mu)_k\\
&amp;= \arg\min_{\Sigma}\; n\log|\Sigma| + \sum_i \sum_j \sum_k \Sigma^{-1}_{jk}(x_i-\mu)_k(x_i-\mu)_{j}\\
&amp;= \arg\min_{\Sigma}\; n\log|\Sigma| + \operatorname{tr}(S\Sigma^{-1})\\
0 &amp;= \frac{\partial}{\partial \Sigma}\left[n\log|\Sigma| + \operatorname{tr}(S\Sigma^{-1})\right] = n\Sigma^{-1} - \Sigma^{-1}S\Sigma^{-1}\\
\Sigma^{-1}S &amp;= n I\\
\Sigma &amp;= \frac{1}{n} S
\end{align}
$$
$$
\begin{align}
\operatorname{tr}(ABC) &amp;= \sum_i (ABC)_{ii} = \sum_i \sum_j A_{ij} (BC)_{ji}\\
&amp;= \sum_i\sum_j A_{ij} \Big(\sum_k B_{jk} C_{ki}\Big)\\
&amp;= \sum_i\sum_j\sum_k A_{ij}B_{jk}C_{ki}
\end{align}
$$
$$
\begin{align}
\frac{\partial f}{\partial X} &amp;= \begin{bmatrix}
\frac{\partial f}{\partial x_{11}} &amp; \dots &amp; \frac{\partial f}{\partial x_{1n}}\\
\vdots &amp; \ddots &amp; \vdots\\
\frac{\partial f}{\partial x_{n1}} &amp; \dots &amp; \frac{\partial f}{\partial x_{nn}}
\end{bmatrix}\\
\frac{\partial\, a^T X b}{\partial X} &amp;= ab^T\\
\frac{\partial \operatorname{tr}(AX)}{\partial X} &amp;= \frac{\partial \sum_i \sum_j a_{ij}X_{ji}}{\partial X} = A^T\\
\frac{\partial \operatorname{tr}(AXB)}{\partial X} &amp;= \frac{\partial \sum_i\sum_j\sum_k A_{ij} X_{jk} B_{ki}}{\partial X} = \frac{\partial \sum_{j}\sum_{k} \big(\sum_i A^T_{ji} B^T_{ik}\big) X_{jk}}{\partial X} = A^T B^T\\
\frac{\partial \operatorname{tr}(X^T A X)}{\partial X} &amp;= (A + A^T)X
\end{align}
$$
$$
\begin{align}
a^T X b &amp;= \sum_{i,j} a_{i}\, x_{ij}\, b_{j}\\
(AXB)_{ij} &amp;= \sum_{k,l} A_{ik} X_{kl} B_{lj}
\end{align}
$$
$$
p(x) = \int p(x \mid \theta)\, p(\theta \mid D)\, d\theta
$$
Trick: completing the square. Applying the trick "completing the square", the posterior also takes the form of a univariate Gaussian, whose parameters are obtained by rearranging; this uses the prior knowledge
As the sample size increases, the posterior Gaussian uses more data and becomes more and more confident. Use it to approximate.]]></description><link>masters/semb/cs5487-ml/ml-lesson-3-多元高斯参数mle.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/ML Lesson 3 多元高斯参数MLE.md</guid><pubDate>Sat, 28 Feb 2026 18:50:36 GMT</pubDate></item><item><title><![CDATA[Convex Optimization]]></title><description><![CDATA[ is a convex set. is a convex function.
Given two points x and y, a convex combination of them is any point of the form between the two points; a strict convex combination excludes the endpoints. Another name: interpolation
any point between x and y is called interpolation, any point outside is called extrapolation
Any convex combination of two points in the set is also in the set. A set is convex if, conceptually, the value at the midpoint is lower than the average value. Convexity is preserved:
Sum of convex functions is convex
Convexity is preserved under a linear transformation. The matrix of second partial derivatives (the Hessian) is positive semidefinite on the interior. PSD (positive semidefinite): eigen-decomposition
eigenvalues, eigenvectors: all eigenvalues of are non-negative
alternatively: check Discrete:
Search
Iteratively improving an assignment
Continuous:
gradient
Gradient Descent: initialize $x \leftarrow x_0$;
repeat $x \leftarrow x - \alpha \nabla _x f(x)$;
until convergence. Open questions: how to choose the initial point
how to choose and update step-size
how to define convergence
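The loop above can be sketched in plain Python (a minimal sketch: the quadratic objective, fixed step size, and step-based stopping rule are illustrative choices, not part of the notes):

```python
# Minimal gradient descent on f(x) = (x - 3)^2, whose gradient is 2(x - 3).
# The objective, step size alpha, and stopping rule are illustrative.
def gradient_descent(grad, x0, alpha=0.1, tol=1e-8, max_iter=10_000):
    x = x0
    for _ in range(max_iter):
        step = alpha * grad(x)
        x -= step
        if abs(step) < tol:  # convergence: the update became negligible
            break
    return x

x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
```

With this step size the iterate contracts toward the minimizer x = 3 geometrically.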
Newton-Raphson Method: if f is twice differentiable, iterate until convergence:
Model a problem as a convex optimization problem define variable, feasible set, objective function
prove it is convex (convex function + convex set). Build up the model
Call a solver
fmincon, cvxpy, cvxopt, cvx
]]></description><link>masters/semb/cs5491-ai/convex-optimization.html</link><guid isPermaLink="false">masters/SemB/CS5491-AI/Convex Optimization.md</guid><pubDate>Tue, 24 Feb 2026 12:58:24 GMT</pubDate></item><item><title><![CDATA[Optimization]]></title><description><![CDATA[Traveling Salesman Problem cities, distance visit every city with minimal total distance
indicator function n-Queens Problem
location of the queen in each column variable Feasible set: or Objective function (dummy)
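A feasibility check for this formulation, where each variable gives the row of the queen in one column, can be sketched as follows (the helper name `is_feasible` and the sample assignments are illustrative, not from the notes):

```python
# Feasibility check for n-Queens where q[i] is the row of the queen
# in column i.  Two queens conflict if they share a row or a diagonal
# (same columns are impossible by construction).
def is_feasible(q):
    n = len(q)
    for i in range(n):
        for j in range(i + 1, n):
            if q[i] == q[j]:               # same row
                return False
            if abs(q[i] - q[j]) == j - i:  # same diagonal
                return False
    return True

ok = is_feasible([1, 3, 0, 2])   # a known 4-queens solution
bad = is_feasible([0, 1, 2, 3])  # all queens on one diagonal
```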
No general way to solve: No Free Lunch theorem
It's always possible to fabricate one problem that your problem solver can't solve Convex optimization problem (CO): GD, SGD
Linear Program: simplex, interior point
(Mixed) Integer Linear Program (MILP)
Decoupling "representation" and "problem solving": Lazy mode Formulate a problem as an optimization problem
Identify which class the formulation belongs to
Call the corresponding solver ]]></description><link>masters/semb/cs5491-ai/optimization.html</link><guid isPermaLink="false">masters/SemB/CS5491-AI/Optimization.md</guid><pubDate>Tue, 24 Feb 2026 11:58:17 GMT</pubDate></item><item><title><![CDATA[ML Lesson 5 Lagrange Multiplier]]></title><description><![CDATA[Lagrange multiplierE step:M step: Non-parametric Apply the Kernel to datapoints: Stretch: by bandwidth Sum and Normalize: Bandwidth Parameter controls smoothness of is small: more peaks is large: less peaks, cover many samples Mean: convolution between true p(x) and kernel.set h according to physical property of sensing of set h to maximize LL of held out condition set.Mean shift]]></description><link>masters/semb/cs5487-ml/ml-lesson-5-lagrange-multiplier.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/ML Lesson 5 Lagrange Multiplier.md</guid><pubDate>Sun, 22 Feb 2026 03:55:11 GMT</pubDate></item><item><title><![CDATA[ML Lesson 4 EM algorithm and Gaussian Mixture]]></title><description><![CDATA[iterative methodmaximum a posteriori (MAP) estimates of parametersexpectation step: create a function for the expectation of the log-likelihood evaluated using the current estimate for the parameters.
maximization step: compute parameters maximizing the expected log-likelihood found in the E-step. Can be used to estimate a mixture of Gaussians. Model:
The equation cannot be solved directly
involve latent variables
unknown parameters
known data observationsFinding a maximum likelihood solution:
taking the derivatives of the likelihood function w.r.t all the unknown values, the param, the latent variable
simultaneously solving the resulting equations
problem: the derivative equations for the parameters and the latent variables interlock; each depends on the other's solution. The result may be a local maximum or a saddle point. Observed data: Suppose the model is a Gaussian mixture; then the params are:
mixture parameter: parameter for each Gaussian E-step: Responsibility? (likelihood of belongs to -th gaussian)M-step: let , which is the "number" of data allocated to -th gaussian]]></description><link>masters/semb/cs5487-ml/ml-lesson-4-em-algorithm-and-gaussian-mixture.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/ML Lesson 4 EM algorithm and Gaussian Mixture.md</guid><pubDate>Sat, 21 Feb 2026 01:06:20 GMT</pubDate></item><item><title><![CDATA[ML Lesson 2 MLE]]></title><description><![CDATA[Likelihood function:Log Likelihood function:
consistent: as , the estimated value converges to the true value. Asymptotically unbiased.
efficient: has the lowest variance of any unbiased estimator. (achieves CR lower Bound, a theoretical bound on the variance of any unbiased estimators.)
Assumption:
Model: polynomial; given the model, the error is determined.
Error: Gaussian; this introduces the probability.
Supervised Learning: Dataset we observe a noisy output given input :
Thus the probability of each data point is: MLE of polynomial model coefficients
How do we measure the probability of the data? Make assumptions about it: assume the relationship in the data is polynomial, and assume the remaining error term is i.i.d. Gaussian. Then y also follows a normal distribution determined by x, so the probability of each data point can be computed. By the independence of the data points, compute the log-likelihood of the data and find the model parameters that maximize it.
Given D, use MLE:This is Least Squares Regression
explicit assumptions: Gaussian Noise
iid samples. ML can describe different forms:
weighted LS: the per-sample errors are heteroscedastic (samples with larger error variance get smaller weights)
regularized LS: Ridge, Lasso
Lp-norm LS: the error distribution does not match the Gaussian assumption
Ridge regularized LS has a closed-form solution:]]></description><link>masters/semb/cs5487-ml/ml-lesson-2-mle.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/ML Lesson 2 MLE.md</guid><pubDate>Fri, 20 Feb 2026 14:22:19 GMT</pubDate></item><item><title><![CDATA[2026.2.9 Trading Plan]]></title><description><![CDATA[Holdings: asset, cost, quantity, P&amp;L
Gold,2670.29,2.3882,962.31
Hang Seng Internet,10393.50,20500,-893.70
Insurance theme,6385.00,5000,50.00
Yangtze Power,5292.00,200,-9.00
CNOOC,3409.00,100,786.13
Short-term repair attempt for USD credit?
Low-level high dividend: Jizhong Energy
Theme: Southeast Asia trade, Beibu Gulf Port. Overall strategy:
How many assets to invest in
How heavy the current position should be
Assets: Gold: collapsing USD credit, turbulent world situation
Oil: national industrial security, monopoly-type enterprise
Dividend: 10%+ high dividend, at a stage low
CSI Insurance: insurance capital entering the market, expected earnings growth. Gold, oil:
Trading channel: ETF or gold accumulation plan
Current strategy: bullish or bearish: keep the current position unchanged; further declines expected
Dividend asset: Jizhong Energy
Has the coal cycle ended?
How large should the position be?
Current holding: CNOOC. Wave of the times: CSI Insurance]]></description><link>财富/交易/股票交易笔记/2026.2.9-交易预案.html</link><guid isPermaLink="false">财富/交易/股票交易笔记/2026.2.9 交易预案.md</guid><pubDate>Mon, 09 Feb 2026 11:13:04 GMT</pubDate></item><item><title><![CDATA[Lesson 4 Transformation]]></title><description><![CDATA[ can be written as Scaling factors: , Uniform scaling Reference point should not change
translate to the origin
scaling
undo the translation leave all points on one axis fixed, while the other points are shifted parallel to the axis by a distance proportional to their perpendicular distance from that axishorizontal shearing (x-shearing)vertical shearing (y-shearing) is given by:Since Translation, Scaling, Rotation, Shearing is invertible, any transformation composed of them is invertible.3D translation, scaling: introduce 1 more dimension.3-step trick:
translate to origin
scaling/rotation w.r.t origin
translate back to the reference point
3D rotation:
rotation axis: defined by two points P1, P2
rotation angle
not commutative
Example:
axis: theta: Solution：
translate so that the axis passes through the origin (move (0, 1, 1) to (0, 0, 0))
apply rotation
translate back ]]></description><link>masters/semb/cs5182-cg/lesson-4-transformation.html</link><guid isPermaLink="false">masters/SemB/CS5182-CG/Lesson 4 Transformation.md</guid><pubDate>Fri, 06 Feb 2026 12:51:57 GMT</pubDate></item><item><title><![CDATA[Lecture 4 Transformers and pretraining-finetuning]]></title><description><![CDATA[
Attention
Transformer
BERT
GPT
]]></description><link>masters/semb/cs6493-nlp/lecture-4-transformers-and-pretraining-finetuning.html</link><guid isPermaLink="false">masters/SemB/CS6493 NLP/Lecture 4 Transformers and pretraining-finetuning.md</guid><pubDate>Wed, 04 Feb 2026 12:02:48 GMT</pubDate></item><item><title><![CDATA[T3]]></title><description><![CDATA[Finetuning tool:
SWIFT finetune
LlamaFactory
Verl: Reinforcement Learning for LLMs
OpenRLHF]]></description><link>masters/semb/cs6493-nlp/t3.html</link><guid isPermaLink="false">masters/SemB/CS6493 NLP/T3.md</guid><pubDate>Wed, 04 Feb 2026 11:25:18 GMT</pubDate></item><item><title><![CDATA[Capital Operations]]></title><description><![CDATA[The ordinary way
Fund it yourself or find partners, bear the risk yourself, and wait for returns
Drawback: the capital is locked up
Low-cost leverage
Sell the project's revenue rights to company A in exchange for shares, then cash out and exit after the stock surges
Rapid replication of low-investment projects
Bet-on agreement: you handle operations, the investor provides capital; the equity is pledged to the investor until the project breaks even, after which profits are split by equity share.
Sell future revenue rights for quick cash
Repeat the business model]]></description><link>财富/经济/资本运作.html</link><guid isPermaLink="false">财富/经济/资本运作.md</guid><pubDate>Tue, 03 Feb 2026 23:07:32 GMT</pubDate></item><item><title><![CDATA[Shoes]]></title><description><![CDATA[Casual shoes
Hiking/outdoor shoes: hard to get dirty, easy to clean. Scenarios:
Commuting / city walk: handles a 20,000-step march without strain
Fabrics:
Three-proof: Cordura
Water-repellent and breathable: Gore-Tex]]></description><link>living/穿搭/鞋.html</link><guid isPermaLink="false">living/穿搭/鞋.md</guid><pubDate>Tue, 03 Feb 2026 11:06:58 GMT</pubDate></item><item><title><![CDATA[CG Lesson 3 Object Modeling]]></title><description><![CDATA[convert the object into a pixel pattern. Draw a line: compare the distance to the centers of the upper and lower pixels and pick the closer one.
Draw a circle: shade the neighboring pixel that minimizes the error. Flood-fill: pick a starting point, fill in 4/8 different directions
unstructured set of 3D point samples
acquired from range finder, computer vision
lightweight format: no connectivity information
difficult surface reconstruction
difficult to perform geometry computation. Process 3D point cloud:
isometric mapping to map 3D point cloud data to a 2D plane. Polygon Meshes: V is the set of vertices; each vertex must belong to at least one edge. E is the set of edges; each edge must belong to at least one face.
an edge is a boundary if it belongs to only one face. F is the set of faces. Orientation: counter-clockwise / clockwise; counter-clockwise is the "front" side (right-hand rule). Euler Formula: polyhedron (a closed manifold mesh without holes/handles). Convex shape:
convex: connect any two points; the segment always lies inside the shape
concave: there exist two points whose connecting segment does not lie entirely inside the shape
Genus: the number of holes of the model. For a closed orientable manifold mesh of genus g: $$
|V| - |E| + |F| = \chi = 2-2g
$$
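The formula can be checked numerically; a minimal sketch (the cube and the 4x4 torus grid counts are illustrative, standard examples):

```python
# Euler characteristic check: for a closed orientable manifold mesh,
# |V| - |E| + |F| = chi = 2 - 2g, so g = (2 - chi) / 2.
def genus(v, e, f):
    chi = v - e + f
    return (2 - chi) // 2

g_cube = genus(8, 12, 6)      # cube: chi = 2, genus 0
g_torus = genus(16, 32, 16)   # 4x4 quad grid on a torus: chi = 0, genus 1
```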
face-vertex: list of vertices (coordinates) + list of faces (set of vertices)
winged-edge
half-edge
quad-edge
corner-tables
Catmull-Clark Subdivision Surface
average of the vertices of the face
mid-point of each edge
Use an implicit function to define the surface:
Signed Distance Field
sign: inside, outside
distance: d(x)
signed distance Field (SDF) Parametric Curves (Cubic Parametric Polynomials): matrix, : matrix, control points
: Basis Matrix. Voxel = Volume Pixel: decompose the object into identical cells arranged in a fixed regular 3D grid. Octree representation. Fractals: recursively applying the same transformation function on a given object;]]></description><link>masters/semb/cs5182-cg/cg-lesson-3-object-modeling.html</link><guid isPermaLink="false">masters/SemB/CS5182-CG/CG Lesson 3 Object Modeling.md</guid><pubDate>Fri, 30 Jan 2026 12:52:43 GMT</pubDate></item><item><title><![CDATA[Lecture 3 Word Embedding]]></title><description><![CDATA[Linguistic way of thinking of meaning: Signifier (symbol). Traditional solution: WordNet
maintain a list of synonym and hypernyms.
synonym: similar words, synonyms of "good" are well, beneficial, honorable
hypernyms: more general terms; 'color' is a hypernym of 'red', while scarlet, crimson, and vermilion are hyponyms of 'red'
encode similarity into vectors. Similar contexts tend to have similar meanings
"You shall know a word by the company it keeps."
-- J. R. Firth, 1957
modern statistical NLP
Latent Semantic Analysis: co-occurrence counting matrix + SVD
Collobert &amp; Weston vectors: 1st neural pretrained word embedding
Word2vec
GloVe
Word embeddings - goal
word embeddings / word vectors / word representation
build a dense vector for each word
A word vector should be similar to vectors of words that appear in similar contexts
A distributed representation
Idea
input: a large corpus of text
output: word vector
method: skip-gram: predict the context words from the center word ("skip" the center gram). CBOW: predict the center word from the context. skip-gram
Likelihood: objective likelihood: NLL: structure
input vector is one-hot representation of the center word.
output is the probability that the word represented by each neuron appears nearby.
network: onehot -&gt; W -&gt; W softmax , : Output / Input word
: the row of the weight matrix in the second layer that is responsible for the output word
: the row of the weight matrix in the first layer that is responsible for the input word
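The output probability of skip-gram is a softmax over dot products between the center word's input vector and each word's output vector. A minimal sketch (the tiny vocabulary and vector values are illustrative):

```python
import math

# Skip-gram output: p(o | c) = exp(u_o . v_c) / sum_w exp(u_w . v_c),
# where v_c is the center word's input vector and u_w are output vectors.
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def skipgram_probs(v_center, u_out):
    scores = [dot(u, v_center) for u in u_out]
    exps = [math.exp(s) for s in scores]
    z = sum(exps)  # normalization constant over the vocabulary
    return [e / z for e in exps]

u = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]  # one output vector per word
p = skipgram_probs([1.0, 0.0], u)
```

Words whose output vectors align with the center vector get higher probability.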
Negative sampling (1): NCE (Noise Contrastive Estimation): maximize the probability of the real context word + minimize the probability of random words. Huffman tree: build the Huffman tree by repeatedly merging the two nodes with minimum frequency
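The repeated min-frequency merge can be sketched with a heap (the word frequencies are illustrative; the entry counter is a tie-breaker so tuples never compare the node dicts):

```python
import heapq

# Build a Huffman tree by repeatedly merging the two lowest-frequency nodes.
def huffman(freqs):
    heap = [(f, i, {"sym": s}) for i, (s, f) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, count, {"left": left, "right": right}))
        count += 1
    return heap[0][0], heap[0][2]  # total frequency, tree root

total, root = huffman({"a": 5, "b": 2, "c": 1, "d": 1})
```

Frequent words end up near the root, giving them short codes (short paths in hierarchical softmax).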
Global Vectors for Word Representation (GloVe): global statistics (LSA, Latent Semantic Analysis) + local context window (word2vec). ELMo: deep contextualized word representation, AI2 &amp; Univ. Washington, 2018; context-dependent embeddings; sentence -&gt; embeddings of the words in the sentence; bidirectional LSTM]]></description><link>masters/semb/cs6493-nlp/lecture-3-word-embedding.html</link><guid isPermaLink="false">masters/SemB/CS6493 NLP/Lecture 3 Word Embedding.md</guid><pubDate>Wed, 28 Jan 2026 13:39:48 GMT</pubDate></item><item><title><![CDATA[Kelly Criterion]]></title><description><![CDATA[Used to find the position size that maximizes the expected investment return rate. Given a win probability and a loss probability, where a win earns some multiple of the stake and a loss loses some multiple of the stake, what fraction of the capital should be wagered each round? Starting with some capital and investing the same fraction each round, the capital remaining after many rounds is:
which can be rewritten as:
Maximizing the expected per-round return rate and setting the derivative to zero gives
or, when the odds are given:
That is: the more confident you are and the higher the odds, the larger the position. In a given time window, if the price first reaches one multiple of its starting value it counts as a win, and if it first reaches another multiple it counts as a loss; the odds are then the ratio of the take-profit and stop-loss prices, and the position is the ratio of in-market to out-of-market funds. Each time a win/loss condition is reached, if you keep betting, automatically adjust back to the optimal position; otherwise cash out and exit. Applying the Kelly criterion in the stock market also requires care:
After each settlement, adjust your odds and win/loss probabilities according to market movements
Per the trading costs summarized earlier in <a data-href="2025-11-24 爆竹投机心得#交易成本" href="财富/交易/2025-11-24-爆竹投机心得.html#交易成本" class="internal-link" target="_self" rel="noopener nofollow">2025-11-24 爆竹投机心得 &gt; 交易成本</a>, buying means an immediate 1% loss.
Also, some stocks are very expensive per share, so the position ratio cannot be controlled precisely; for now the Kelly criterion suits cheap ETF funds.
Estimating the win rate is closer to divination: no one can predict a stock's direction without inside information. But in general: <br>your win rate is higher when your cost basis is below that of the market's recent short-term investors than when it is above <a data-tooltip-position="top" aria-label="2025-11-24 爆竹投机心得 > 错误：盲目追涨建仓" data-href="2025-11-24 爆竹投机心得#错误：盲目追涨建仓" href="财富/交易/2025-11-24-爆竹投机心得.html#错误：盲目追涨建仓" class="internal-link" target="_self" rel="noopener nofollow">build positions on down bars</a>
<br>the start of a major up-cycle has a higher win rate than its end <a data-tooltip-position="top" aria-label="2025-11-24 爆竹投机心得 > 错误：我要吃所有波段/我要吃满上涨周期" data-href="2025-11-24 爆竹投机心得#错误：我要吃所有波段/我要吃满上涨周期" href="财富/交易/2025-11-24-爆竹投机心得.html#错误：我要吃所有波段/我要吃满上涨周期" class="internal-link" target="_self" rel="noopener nofollow">the danger at the end of an up-cycle</a>
<br>an insufficient pullback early in a major up-cycle has a higher win rate than overextension right after the rise begins <a data-tooltip-position="top" aria-label="2025-11-24 爆竹投机心得 > 错误：割肉后又见涨立刻加仓" data-href="2025-11-24 爆竹投机心得#错误：割肉后又见涨立刻加仓" href="财富/交易/2025-11-24-爆竹投机心得.html#错误：割肉后又见涨立刻加仓" class="internal-link" target="_self" rel="noopener nofollow">wait for the dip after a rise</a>
Early volume surges offer relatively high win rates and odds. Setting the odds also takes skill: the higher the odds you demand, the lower the win rate and the longer the wait. When the stop-loss price is set too high, the win rate is very low
When the stop-loss price is set too low, the win rate is very high, but the time stretches toward infinity (bad for compounding). In the stock market, the win rate keeps changing after the odds are set
"Gambling" in the stock market is not settled on a fixed schedule: the duration depends on your stop-loss and take-profit
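The standard Kelly fraction for a bet with odds b (win pays b times the stake, a loss forfeits the stake) is f* = p - (1-p)/b; a minimal sketch with illustrative numbers:

```python
# Kelly fraction: bet wins with probability p and pays b times the stake
# on a win (the full stake is lost otherwise):  f* = p - (1 - p) / b.
# The example values (p=0.6, even odds) are illustrative.
def kelly_fraction(p, b):
    q = 1.0 - p
    return p - q / b

f = kelly_fraction(p=0.6, b=1.0)  # 60% win rate at even odds -> wager 20%
```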
]]></description><link>财富/交易/凯利公式.html</link><guid isPermaLink="false">财富/交易/凯利公式.md</guid><pubDate>Wed, 28 Jan 2026 12:09:53 GMT</pubDate></item><item><title><![CDATA[Lesson 2 Language Modeling]]></title><description><![CDATA[Statistical Language Model
Neural Language Model
EvaluationInfinite monkey theoremA language model is a probability distribution determine whether a given ordering of words sounds line natural language.A language model:language modeling: a sub-component of many NLP tasks.Task: given a sequence of words, compute the probability distribution of the next word on the Vocabulary .Markov assumption: depends only on the preceding words.
Approximation: P/P = Counting / Counting. Problems &amp; solutions: the n-1 grams never appear -&gt; backoff: take the n-2 gram instead.
the n grams never appear -&gt; smooth: + delta
long tail: n usually can't exceed 5. Hard to distinguish multi-word phrases. The probability decides the chance to be picked. The probability of the next word is conditioned on the entire history. Word embeddings
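The counting approximation with add-delta smoothing can be sketched on a toy corpus (the corpus and delta value are illustrative):

```python
from collections import Counter

# Bigram probability with add-delta smoothing:
#   P(w | prev) = (count(prev, w) + delta) / (count(prev) + delta * |V|)
corpus = "the cat sat on the mat".split()
vocab = set(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)

def p_bigram(prev, w, delta=0.5):
    return (bigrams[(prev, w)] + delta) / (unigrams[prev] + delta * len(vocab))

p_seen = p_bigram("the", "cat")    # observed bigram
p_unseen = p_bigram("cat", "mat")  # unseen bigram still gets some mass
```

Smoothing guarantees no n-gram receives probability zero, at the cost of shaving mass from observed counts.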
Fixed window is too small
No symmetry in how the inputs are processed.
Recurrent Neural Networks
short-term memory is generated from the previous memory and the current input; the current output is computed from the current memory. SwiGLU. Compute the output distribution for every step. Loss function: since the true distribution is one-hot, the loss for the entire corpus is the average of the cross entropy. Back-propagation: gradient vanishing/exploding. LSTM:
input gate forget gate
output gate
new content
Combining new content and input gate, Combining previous state and forget gate, Combining both of them to generate new current state.
generate new hidden state with the output gate and current state
hidden state is the short term memory is the Long term memory
long term thread: generated by input and forget gate and new content
short term thread: generated from output gate and long term memory
gates are computed by the hidden state and current input.
information uses tanh (-1, 1); gates use sigmoid (0, 1). GRU. Perplexity: the normalized product of the inverses of the next-word probabilities, showing the confidence of the model. One example:
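Equivalently, perplexity is the exponential of the average negative log-probability; a minimal sketch (the probability values are illustrative model outputs):

```python
import math

# Perplexity: geometric mean of the inverse next-word probabilities,
#   PP = exp( -(1/N) * sum_i log p_i ).
def perplexity(probs):
    n = len(probs)
    return math.exp(-sum(math.log(p) for p in probs) / n)

pp_uniform = perplexity([0.25] * 4)            # uniform over 4 words -> PP = 4
pp_confident = perplexity([0.9, 0.8, 0.95, 0.9])
```

A model that spreads probability uniformly over k words has perplexity k; a confident model scores lower.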
]]></description><link>masters/semb/cs6493-nlp/lesson-2-language-modeling.html</link><guid isPermaLink="false">masters/SemB/CS6493 NLP/Lesson 2 Language Modeling.md</guid><pubDate>Wed, 28 Jan 2026 12:03:59 GMT</pubDate></item><item><title><![CDATA[WCCI 2024 MMO]]></title><description><![CDATA[Empirical Risk Minimization
Configure a function with parameters; given a set of input-output pairs, use a loss function to compare predictions against ground truth, compute the loss vector, aggregate the dataset's overall average loss, and add a regularization term to obtain the total loss. The loss function should be differentiable so that back-propagation can descend the loss along the gradient.
A differentiable loss function and regularization term. MOO:
: different optimization objectives.
Goal: find multiple Pareto-optimal solutions. What does Pareto-optimal mean?
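For minimization, a solution dominates another if it is no worse in every objective and strictly better in at least one; a point is Pareto-optimal if nothing dominates it. A minimal sketch (the bi-objective values are illustrative):

```python
# Pareto dominance for minimization: a dominates b if a is no worse in
# every objective and strictly better in at least one.
def dominates(a, b):
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Illustrative bi-objective values (both objectives to be minimized):
front = pareto_front([(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)])
```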
]]></description><link>masters/semb/cs5491-ai/wcci-2024-mmo.html</link><guid isPermaLink="false">masters/SemB/CS5491-AI/WCCI 2024 MMO.md</guid><pubDate>Wed, 28 Jan 2026 11:01:59 GMT</pubDate></item><item><title><![CDATA[CS5491 AI Lesson 3 Constraint Satsifactions]]></title><description><![CDATA[Approach:
draw a problem as a graph
Constraint Graph: nodes are variables, arcs are constraints
General-purpose CSP algorithms use the graph structure to speed up search. Example: Tasmania. Binary CSP: each constraint relates two variables. Problem: put N queens on the board.
Constraint: No two queens can kill each otherFormulation1:Constraints:
No two queens are in the same row: No two queens are in the same column: No two queens are in the same diagonal: No two queens are in the same diagonal (2): All queens are placed: Formulation 2: Variables: : column number of the queen in row Domains: Constraints: Discrete Variables
Finite domains
NP-complete
Finite domains Size means complete assignments
Boolean CSPs, including Boolean satisfiability (NP-complete). Infinite domains: linear constraints are solvable, nonlinear ones undecidable. Continuous variables: LP methods. More than polynomial: if a problem is easy to verify, it is in NP; whether every such problem is also easy to solve is the P vs. NP question.
Varieties of Constraints: Unary, Binary, Higher-order constraints involve 3 or more variables
Preferences (soft constraints) solvable if the constraint is convex.
leads to constrained optimization problems Scheduling problems
Timetabling problems
Assignment problems
initial state: the empty assignment: goal test: have all variables been assigned a value?
Different Searching Algorithms for CSP:
BFS: breadth-first search. DFS: depth-first search, works at least slightly better
a basic uninformed algorithm for solving CSPsOne variable at a time
variable assignments are commutative: ordering does not matter.
check constraints as you go: consider only values which do not conflict with previous assignments. Variable Ordering: Minimum remaining values (MRV): expand the variable with the fewest legal value left
Why: "Fail-fast" ordering Value Ordering: Least Constraining Value: choose the one that rules out the fewest values in the remaining variables. Forward checking: propagates information from assigned to unassigned variables.
consistent: An arc is consistent iff for every value of the tail there is some allowed value of the head. Subproblem: if the constraint graph has no loops, the CSP can be solved in O(n d^2) time. Deadlines: Feb 24, 2026, 11:59 pm HKT One Page Proposal
Mar 31, 2026, 11:59 pm +1 month project milestone
Apr 26, 2026, 11:59 pm HKT final project submit
]]></description><link>masters/semb/cs5491-ai/cs5491-ai-lesson-3-constraint-satsifactions.html</link><guid isPermaLink="false">masters/SemB/CS5491-AI/CS5491 AI Lesson 3 Constraint Satsifactions.md</guid><pubDate>Tue, 27 Jan 2026 20:07:22 GMT</pubDate></item><item><title><![CDATA[Imaging Principles]]></title><link>摄影/成像原理.html</link><guid isPermaLink="false">摄影/成像原理.md</guid><pubDate>Thu, 22 Jan 2026 06:47:14 GMT</pubDate></item><item><title><![CDATA[HDR]]></title><description><![CDATA[Difference from exposure bracketing: HDR stitches together the bright and dark parts of images; bracketing stacks exposures]]></description><link>摄影/效果/hdr.html</link><guid isPermaLink="false">摄影/效果/HDR.md</guid><pubDate>Wed, 21 Jan 2026 23:07:35 GMT</pubDate></item><item><title><![CDATA[Motion Blur]]></title><description><![CDATA[di ku]]></description><link>摄影/效果/运动模糊.html</link><guid isPermaLink="false">摄影/效果/运动模糊.md</guid><pubDate>Wed, 21 Jan 2026 23:02:37 GMT</pubDate></item><item><title><![CDATA[Shutter]]></title><description><![CDATA[Mechanical focal-plane shutter:
The front curtain drops and the CMOS clears its charge; the front curtain opens and the CMOS starts exposing; the rear curtain drops and the CMOS ends exposure; after CMOS readout completes, the rear curtain opens and resets. Electronic shutter: rows are exposed line by line and read out line by line after a delay; the delay is the exposure time, and the readout speed determines the shutter speed. Electronic front-curtain shutter
Principle: the electronic front curtain starts the exposure, and the mechanical rear curtain blocks light before readout. This separates the exposure and readout phases and exploits the mechanical rear curtain's speed advantage to reduce the rolling-shutter (jello) effect.
Drawback: the two curtains are not in the same plane: the rear curtain sits in front of the focal plane while the electronic front curtain is at the focal plane, so at very short shutter intervals bokeh balls get clipped. Out-of-focus circles of confusion: because the curtains are in different planes, the front curtain partially blocks the bottom of foreground bokeh. Effects of a fast shutter:
Lower light intake; can suppress highlights at wide apertures
Less blur
]]></description><link>摄影/参数/快门.html</link><guid isPermaLink="false">摄影/参数/快门.md</guid><pubDate>Wed, 21 Jan 2026 23:01:17 GMT</pubDate></item><item><title><![CDATA[Depth of Field]]></title><description><![CDATA[Shallow depth of field means producing a large enough circle of confusion on the image plane
Large aperture (low F-number): a wider cone of incoming light, so a larger circle of confusion
Long focal length: weaker refractive power, so a larger circle of confusion
Close subject distance: the closer the subject, the smaller the distance difference needed to reach a given circle-of-confusion size, so depth of field is shallow.
Using shallow depth of field: pair a large aperture with a fast shutter to keep the light intake right; usually for tighter focus. Close-range telephoto fits well; for large subjects, distant telephoto also works.
Using deep depth of field: pair a small aperture with a slow shutter to keep enough light; usually to capture more information; distant wide-angle fits]]></description><link>摄影/效果/景深.html</link><guid isPermaLink="false">摄影/效果/景深.md</guid><pubDate>Wed, 21 Jan 2026 22:59:12 GMT</pubDate></item><item><title><![CDATA[Focal Length]]></title><description><![CDATA[Focal Length: the focal length is defined as the distance from the lens center to the point where light rays converge. Easily confused: "adjusting focal length (zooming)" and "focusing" are different operations. "Focusing" adjusts the lens group to change the overall refraction so the image forms sharply on the sensor.
Magnification: with the lens size unchanged, the longer the focal length, the smaller the angle of light that can be refracted onto the image plane, so the subject appears larger in the photo
Background magnification: step back from the subject so that it covers the same field-of-view angle as a close short-focal shot; since the background's viewing angle shrinks for the same subject, the background as a whole looks magnified.
Shallow depth of field: a high focal length means low refractive power, so objects at the same distance from the focused subject produce larger circles of confusion on the sensor, hence a shallower depth of field.
Consider focusing on an object at infinity, i.e., collecting parallel rays onto the focal plane. The camera's permissible circle-of-confusion diameter is fixed, so compared with a wide-angle lens, a telephoto focused at infinity produces larger circles of confusion. The thin-lens imaging formula: the reciprocal of the focal length equals the sum of the reciprocals of the image distance and the object distance. A camera focuses by adjusting the image distance according to the object distance at fixed focal length, while zooming changes the lens group's focal length. The human eye focuses at fixed image distance by adjusting the focal length according to the object distance. The focal length increases with object distance, so when looking at a near object the focal length shortens, and distant objects start to look more compact: exactly the wider field of view caused by a shorter focal length. True myopia: the image distance has lengthened, so distant objects need a longer focal length than the fully relaxed lens provides; a concave lens helps "extend" the focal length. Pseudo-myopia: the crystalline lens is tense and cannot relax to a longer focal length. Presbyopia: the lens loses elasticity and shortening the focal length becomes difficult, so a convex lens helps with refraction. In short, true myopia: the adjustable focal range is misaligned with the needed range; pseudo-myopia and presbyopia: the adjustable range has shrunk.]]></description><link>摄影/参数/焦距.html</link><guid isPermaLink="false">摄影/参数/焦距.md</guid><pubDate>Wed, 21 Jan 2026 22:49:19 GMT</pubDate></item><item><title><![CDATA[Pasted image 20260122050914]]></title><description><![CDATA[<img src="摄影/pasted-image-20260122050914.png" target="_self">]]></description><link>摄影/pasted-image-20260122050914.html</link><guid isPermaLink="false">摄影/Pasted image 20260122050914.png</guid><pubDate>Wed, 21 Jan 2026 21:09:14 GMT</pubDate><enclosure url="." length="0" type="false"/><content:encoded>&lt;figure&gt;&lt;img src="."&gt;&lt;/figure&gt;</content:encoded></item><item><title><![CDATA[Aperture]]></title><description><![CDATA[The aperture is written as an expression such as f/2. This describes the aperture's absolute diameter, where f is the focal length.
The F-number can be understood as the ratio of focal length to aperture diameter, i.e., the reciprocal of the sine of the angle of rays converging on the central focus; it can also be read as the aperture's size relative to the focal length.
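Since light gathered scales with aperture area, the intake ratio between two F-numbers follows their squared ratio; a minimal sketch (the f/1.4 vs f/2.8 comparison is an illustrative example):

```python
import math

# Between F-numbers N1 < N2 the light-intake ratio is (N2 / N1) ** 2,
# which corresponds to 2 * log2(N2 / N1) stops of exposure.
def light_ratio(n1, n2):
    return (n2 / n1) ** 2

def stops(n1, n2):
    return 2 * math.log2(n2 / n1)

r = light_ratio(1.4, 2.8)  # f/1.4 gathers 4x the light of f/2.8
s = stops(1.4, 2.8)        # a 2-stop difference
```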
The light-intake ratio between two apertures is the square of the ratio of their F-numbers, so the lower F-number lets in correspondingly more light. When we say "large aperture" or "small aperture", we actually mean not the physical diameter but the F-number, the ratio of focal length to aperture diameter. A large aperture means a small F-number: the cone of light refracted into the lens widens, and the circle of confusion produced at the same distance grows. After opening up the aperture, circles of confusion that were smaller than a sensor pixel now cover many pixels and become halos; halos that were already there become even larger.]]></description><link>摄影/参数/光圈.html</link><guid isPermaLink="false">摄影/参数/光圈.md</guid><pubDate>Wed, 21 Jan 2026 20:24:38 GMT</pubDate></item><item><title><![CDATA[ISO]]></title><description><![CDATA[The ISO value measures the camera's sensitivity to light intensity. The camera's light-sensing process:
The floating diffusion (FD) node converts electrons into a voltage signal; the source follower (SF) gate receives the voltage from the FD capacitance and relays it to the downstream circuitry.
The programmable gain amplifier (PGA) amplifies the read-out voltage by the ratio ISO / base ISO (usually 100) and hands it to the analog-to-digital converter (ADC).
The ADC maps analog signals of similar strength to the same digital level.
A camera can therefore raise its sensitivity by:
Increasing the electron-to-voltage conversion gain
Raising the PGA amplification factor
Amplifying the digital signal
Since these strategies are generally tied to ISO ranges, several concepts arise:
Base ISO: the sensor's sensitivity without any amplification.
Native ISO: amplified by the analog circuitry (the analog signal), roughly 100-12800.
Extended ISO: brightens the image by amplifying the digital signal; in-camera brightening and post-processing brightening are the same operation, with no difference.
Different cameras raise sensitivity with different mechanisms, so the exact meaning of ISO depends on context. Ordinary digital cameras:
Raising ISO amplifies the signal: analog amplification at low ISO, then digital amplification once the analog gain reaches its ceiling. Dual Conversion Gain (DCG) cameras:
Such cameras have two capacitances, and thus two base ISOs. In bright light they use low-gain electron-to-voltage conversion, accommodating a wider range of light intensity; in low light they use high-gain conversion, which avoids the PGA's analog amplification and thus avoids amplifying all front-end noise. Choosing a higher ISO tells the camera to raise sensitivity; for a DCG camera, above a threshold it switches to the high-gain circuit to amplify the electron signal instead of the PGA's analog amplification. The ADC's input voltage has a ceiling, and anything above it is overexposed, so too large an ISO gain clips. Dynamic range is a signal concept: the logarithm of max/min signal, expressed in decibels. In a camera it is the log2 of the full-well electron count over the readout-noise electron count; exposure value (EV) is computed as the log2 of read-out electrons over readout-noise electrons. Raising ISO lifts the noise floor, shrinking the ADC's effective voltage range: the maximum stays put while the minimum rises, so dynamic range shrinks. Light intake: the light the sensor actually receives, determined by shutter and aperture. A fast shutter resists blur; a small aperture deepens depth of field. What ISO is for:
It lets the ADC resolve underexposed shots better. The ADC's resolution is capped; with insufficient light, the analog image's dynamic range is too small, and the ADC cannot distinguish the voltage changes well, collapsing different analog signals into the same digital level. Raising ISO therefore lets underexposed photos be resolved better by the ADC.
Versus post digital amplification: although raising ISO also amplifies noise, compared with post digital amplification the ADC resolves more image detail, and there is no problem of ADC noise being amplified.
If the light intake is large enough, use native ISO directly; if it is so large that even native ISO overexposes, reduce the light intake.
Impedance: a circuit's total opposition to alternating current, comprising resistance and reactance; reactance includes capacitive reactance (a capacitor's opposition to changing voltage) and inductive reactance (self-induced EMF from changing current). Gain: the ratio of output to input. Dynamic range: 10 times the logarithm of the ratio of maximum signal to maximum noise, in decibels (dB). Exposure value (EV): EV indicates the camera's light intake; EV0 corresponds to the intake at ISO 100 and aperture f/1.0. EV compensation adjusts the camera's automatic light-intake decision. Large light ratio (high-contrast scene)]]></description><link>摄影/参数/iso.html</link><guid isPermaLink="false">摄影/参数/ISO.md</guid><pubDate>Wed, 21 Jan 2026 13:20:06 GMT</pubDate></item><item><title><![CDATA[T2]]></title><description><![CDATA[<a rel="noopener nofollow" class="external-link is-unresolved" href="https://gpt-tokenizer.dev" target="_self">https://gpt-tokenizer.dev</a> token -&gt; embeddings. 50257: total number of the tokens
768: length of the embedding vectors]]></description><link>masters/semb/cs6493-nlp/t2.html</link><guid isPermaLink="false">masters/SemB/CS6493 NLP/T2.md</guid><pubDate>Wed, 21 Jan 2026 11:22:15 GMT</pubDate></item><item><title><![CDATA[Photo Formats]]></title><description><![CDATA[RAW: all sensor information
JPEG: lossy compression
TIFF: RAW information + editing information]]></description><link>摄影/相片格式.html</link><guid isPermaLink="false">摄影/相片格式.md</guid><pubDate>Wed, 21 Jan 2026 10:22:11 GMT</pubDate></item><item><title><![CDATA[Camera Modes]]></title><description><![CDATA[Point-and-shoot mode
Auto mode: the camera decides all parameters. If you need to adjust white balance, ISO, or exposure compensation:
P (program auto): the camera automatically controls the light intake (aperture and shutter)
A (aperture priority): the user sets the aperture; the camera meters and sets the shutter speed
S (shutter priority): the user sets the shutter; the camera meters and sets the aperture
M: full manual
Aperture: a large aperture gives shallow depth of field
Shutter: a slow shutter blurs easily
White balance: low/medium/high temperature corresponds to red/white/blue. The temperature sets the white point: to make orange whiter, lower the temperature; to make blue whiter, raise it.]]></description><link>摄影/相机不同档位.html</link><guid isPermaLink="false">摄影/相机不同档位.md</guid><pubDate>Wed, 21 Jan 2026 10:15:54 GMT</pubDate></item><item><title><![CDATA[CS5491 AI Lesson 1]]></title><description><![CDATA[CAPTCHA: Turing Test
"Right" thing: Logic, Economically
Utility of outcomesmaximizing your expected utilityKnow how to Build a rational agent
Rational Logic
Utility of outcomes MAXIMIZE YOUR EXPECTED (calculated, learned) UTILITY
Agent-Environment model Agent: perceive and act
Environment: updated by act and feed back general, fundamental basis of AI
What kind of technique to use to solve the problem?
grounded, rather than hype
rational agents are:
not omniscient (omniscient: knows everything)
not clairvoyant (clairvoyant: predicts the future)
explore and learn: essential qualities required in unknown environments
not necessarily successful, but are autonomousDescribe a rational agents: PEAS
Performance Measure
Environment fully / partially observable
single / multi agent
deterministic / stochastic
Discrete / Continuous (Countability) Actuators
Sensors
Planning Agents:
Generate complete, optimal plan offline, then execute
Generate simple, greedy plan, start executing, replan if necessary search problem:
a state space
a successor function (actions, costs)
a start state and goal test (finish or not)
a solution is sequence of actions which transforms the start state to the goal.
Search Problems Are Models:
actual problem: high-resolution
model: good approximation
Traveling in RomaniaWorld State: Includes every last detail of the environment
the size of the world could be big in auto-trading?
Search State: keeps only details necessary for planning piazza registration
survey
piazza questioning <a rel="noopener nofollow" class="external-link is-unresolved" href="https://multiobjective-ml.org" target="_self">https://multiobjective-ml.org</a> python review notebook
Dear Professor, I am writing to ask for recommended reading materials on multi-objective optimization and for advice on my research career. During the class, I learned from you that multi-objective optimization concerns settings with multiple conflicting optimization targets, such as auto-trading. I’m wondering whether there are any recommended introductory materials on that. In my view, auto-trading is quite a challenging topic, given that many human investors can’t make sustainable profits from the stock market, and it seems like a complex interdisciplinary topic. By the way, I’m looking for an opportunity to start my research career. Do you think it’s wise to begin with financial computing, or should I just keep it as a hobby? Looking forward to your reply. Thanks!]]></description><link>masters/semb/cs5491-ai/cs5491-ai-lesson-1.html</link><guid isPermaLink="false">masters/SemB/CS5491-AI/CS5491 AI Lesson 1.md</guid><pubDate>Mon, 19 Jan 2026 15:13:03 GMT</pubDate></item><item><title><![CDATA[CS5487 ML Lesson 1]]></title><description><![CDATA[Building Blocks
linear algebra / calculus / probability &amp; statistics
information theory
optimization theory
<a rel="noopener nofollow" class="external-link is-unresolved" href="https://xkcd.com/1838/" target="_self">https://xkcd.com/1838/</a>Identify, Implement, Analyze, Designassignments:
theory programming
Final exam: problems are a subset of the course packet
input: event
Probability mass function -&gt; probability
Probability density function -&gt; likelihood
Probability is bounded by 0 and 1, but a likelihood (density) can be greater than one.
Bernoulli distribution (Coin)
Parameter p: probability of heads P(X = 1) = p, probability of tails P(X = 0) = 1 - p
Poisson Distribution: # of arrivals over a fixed time period, P(X = k) = lambda^k e^(-lambda) / k!, where k is the number of occurrences and lambda is the expectation
limit of the binomial distribution, as n grows while n * p = lambda stays unchanged.
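The binomial-to-Poisson limit can be checked numerically; a hedged sketch (the variable names and the choice n = 100 000 are illustrative):

```python
from math import comb, exp, factorial

lam = 3.0
n = 100_000
p = lam / n  # n grows while n * p = lambda stays fixed

def binom_pmf(k, n, p):
    # P(X = k): accumulate all arrangements of k successes in n trials
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    return lam**k * exp(-lam) / factorial(k)

# for small k the two pmfs agree closely once n is large
for k in range(6):
    assert abs(binom_pmf(k, n, p) - poisson_pmf(k, lam)) < 1e-4
```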
Gaussian Distribution
Central Limit Theorem: the sum of n random variables converges to a Gaussian for large n
Joint distributions
Marginal distribution ("marginal": written down in the margin of the paper)
Conditional probability
Statistical independence: joint = product of marginal distributions
Bayes' Rule: we can "invert" the condition from "given y" to "given x"
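Bayes' rule and the normalization-constant reading of the denominator can be checked with a two-coin example (the numbers are made up for illustration):

```python
# Prior p(x): pick the fair or the biased coin uniformly.
p_fair, p_biased = 0.5, 0.5
# Likelihood p(y | x): probability of heads for each coin.
lik_fair, lik_biased = 0.5, 0.9

# Denominator p(y): normalization constant, so the posterior sums to 1.
evidence = p_fair * lik_fair + p_biased * lik_biased

# Bayes' rule "inverts" the condition: p(x | y) = p(y | x) p(x) / p(y).
post_fair = p_fair * lik_fair / evidence
post_biased = p_biased * lik_biased / evidence

assert abs(post_fair + post_biased - 1.0) < 1e-12
assert post_biased > post_fair  # heads is evidence for the biased coin
```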
how to memorize: the denominator is a normalization constant, so the posterior sums to 1
Expectation: on average, what value of x would I see? A weighted sum of values
Mean, Variance, Covariance
vector-vector operations
inner product: x^T y, "similarity", the sum of element-wise products
norm, distance
outer product: x y^T, a "combination of elements"
element-wise product
matrix
matrix vector operations
matrix-vector product Ax: a linear combination of the columns of A with coefficients in x, i.e. a weighted sum of vectors; also the vector of inner products between the rows of A and x
for A^T x: similarity between the columns of A and x
matrix-matrix operations: multiply
weighted sums of the columns of A, with the weight vectors taken from B's columns; entries are similarities between the columns of A and the columns of B
AB as a sum of outer products
joint distribution: mean vector mu = E[x] (each entry E[x_i] is a scalar)
covariance matrix: Sigma = E[(x - mu)(x - mu)^T], the expectation of outer products
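These vector/matrix readings can be verified directly; a sketch with NumPy (the matrices are arbitrary examples):

```python
import numpy as np

A = np.array([[1., 2.], [3., 4.]])
B = np.array([[5., 6.], [7., 8.]])
x = np.array([10., 20.])

# A @ x = linear combination of A's columns with coefficients in x ...
assert np.allclose(A @ x, x[0] * A[:, 0] + x[1] * A[:, 1])
# ... and also the vector of inner products between A's rows and x.
assert np.allclose(A @ x, [A[0] @ x, A[1] @ x])

# A @ B = sum of outer products of A's columns with B's rows.
assert np.allclose(A @ B, sum(np.outer(A[:, k], B[k, :]) for k in range(2)))

# Covariance matrix = expectation of outer products of centered samples.
X = np.random.default_rng(0).normal(size=(1000, 2))
Xc = X - X.mean(axis=0)
cov = (Xc[:, :, None] * Xc[:, None, :]).mean(axis=0)
assert np.allclose(cov, np.cov(X.T, bias=True))
```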
Multivariate Gaussian
distance: (x - mu)^T Sigma^(-1) (x - mu), with Sigma a positive, symmetric matrix
determinant: |Sigma| = "volume" of the Gaussian
Special Case: independent components -&gt; diagonal Sigma]]></description><link>masters/semb/cs5487-ml/cs5487-ml-lesson-1.html</link><guid isPermaLink="false">masters/SemB/CS5487-ML/CS5487 ML Lesson 1.md</guid><pubDate>Fri, 16 Jan 2026 13:00:34 GMT</pubDate></item><item><title><![CDATA[Lesson 1]]></title><description><![CDATA[project: 25%, research driven
midterm: 15%
final: 60%
Object modeling, Transformation, Projection and Clipping, Hidden Surface Removal and Shading, The Rendering Pipeline, Ray-Tracing and Radiosity, Quiz (in-class), Aliasing and Antialiasing, Real-time rendering, GPU Architecture, Computer Animation, Neural rendering, image-based rendering, Course Revision, Course Project Presentation
Computer Graphics: Scene description to Digital image
Movies
Video Games
Computer-Aided Design
Simulators
Scientific Visualization
Modeling, Animation, RenderingModeling2D/3D shape representation
polygonal meshes
subdivision surfaces
splines
geometry images
Digital geometry processing
3D scanning
Denoising
Shape editing
Simplification
Parameterization
Compression
Animation
Physics-based simulation
Natural phenomena simulation
Motion capture
Animation transfer
Rendering
Photo-realistic
Non-photo-realistic
2D/3D Shape Representations
Polygonal meshes
Spline surfaces
Surface Reconstruction
Dynamic Surface Reconstruction
Simplification
Morphing]]></description><link>masters/semb/cs5182-cg/lesson-1.html</link><guid isPermaLink="false">masters/SemB/CS5182-CG/Lesson 1.md</guid><pubDate>Fri, 16 Jan 2026 12:26:22 GMT</pubDate></item><item><title><![CDATA[CS6493 NLP Lesson 1]]></title><description><![CDATA[Computer Science + Linguistics. Linguistics:
sounds: phonetics, phonology
language structure: morphology, syntax
meaning: semantics, pragmatics
Focus: semantics, syntax, morphology
Basics: linguistics, language models, word embeddings
NLP Tasks: Focus on: Understanding and Generation Tasks.
Machine learning, question answering, dialogue, text classification
Second Part: LLMs: Transformers
Pre-training/Fine-tuning Techniques
Prompting, alignment, efficient fine tuning, agents, RAG. Q&amp;A: Canvas/Email/TAs
60% Continuous Assessment + 40% Final Exam
1-8 students
Midterm &amp; final reports + codes + others
Presentation: Week 13, 10-min presentation + 2-min QA
1940s Machine Translation
Warren Weaver Memorandum 1949 <a rel="noopener nofollow" class="external-link is-unresolved" href="https://aclanthology.org/www.mt-archive.info/90/MTNI-1999-Hutchins.pdf" target="_self">https://aclanthology.org/www.mt-archive.info/90/MTNI-1999-Hutchins.pdf</a>1950s
Turing test
Noam Chomsky's Syntactic Structures revolutionized Linguistics with "universal grammar"
Preprocessing of data
remove stop words: they carry little information
Stemming: chopping the word to its basic form
Lemmatization
PoS (Part of Speech) tagging
N-grams: uni/bi/tri-grams, N-sized moving-window grouping of the words in the sentences.
Textual data vectorization: to number
Bag of Words
a word appearing counting vector
TF-IDF: Term Frequency-Inverse Document Frequency
Term Frequency: the term's frequency in the sentence; higher means more important
Inverse Document Frequency: idf(t) = log(N / df(t)); higher means more important, low means "common words"
TF-IDF score: tf x idf; higher means the term matters more in this sentence.
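A minimal TF-IDF sketch; the toy corpus and the plain log idf formula are assumptions (real libraries use smoothed variants):

```python
import math

docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs",
]

def tf(term, doc):
    words = doc.split()
    return words.count(term) / len(words)       # term frequency in this sentence

def idf(term, docs):
    df = sum(term in d.split() for d in docs)   # document frequency
    return math.log(len(docs) / df)             # rare term -> high idf

def tfidf(term, doc, docs):
    return tf(term, doc) * idf(term, docs)

# "the" is a common word (low idf); "cat" is rarer and scores higher
assert idf("cat", docs) > idf("the", docs)
assert tfidf("cat", docs[0], docs) > tfidf("the", docs[0], docs)
```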
]]></description><link>masters/semb/cs6493-nlp/cs6493-nlp-lesson-1.html</link><guid isPermaLink="false">masters/SemB/CS6493 NLP/CS6493 NLP Lesson 1.md</guid><pubDate>Wed, 14 Jan 2026 13:51:33 GMT</pubDate></item><item><title><![CDATA[Sem B 选课]]></title><description><![CDATA[<a rel="noopener nofollow" class="external-link is-unresolved" href="https://www.cityu.edu.hk/catalogue/pg/202526/course/CS5491.htm" target="_self">https://www.cityu.edu.hk/catalogue/pg/202526/course/CS5491.htm</a>nlp
ml
ai
CS5491 Artificial Intelligence
C01 T 12:00-13:50, T01 14:00-14:50
C61 T 19:00-20:50, T61 21:00-21:50
CS5487 Machine Learning: Principles and Practice
C01 R 15:00-17:50
C61 R 19:00-21:50
CS5182 Computer Graphics
C61 F 19:00-20:50, T61 21:00-21:50
CS5483 Data Warehousing and Data Mining
C61 R 19:00-21:50
CS6493 Natural Language Processing
C61 W 20:00-21:50
T61 17:00-17:50
T62 18:00-18:50
T63 <br><img alt="Screenshot 2026-01-05 at 14.22.03.png" src="masters/semb/screenshot-2026-01-05-at-14.22.03.png" target="_self">]]></description><link>masters/semb/sem-b-选课.html</link><guid isPermaLink="false">masters/SemB/Sem B 选课.md</guid><pubDate>Tue, 13 Jan 2026 14:49:03 GMT</pubDate><enclosure url="." length="0" type="false"/><content:encoded>&lt;figure&gt;&lt;img src="."&gt;&lt;/figure&gt;</content:encoded></item><item><title><![CDATA[Screenshot 2026-01-05 at 14.22.03]]></title><description><![CDATA[<img src="masters/semb/screenshot-2026-01-05-at-14.22.03.png" target="_self">]]></description><link>masters/semb/screenshot-2026-01-05-at-14.22.03.html</link><guid isPermaLink="false">masters/SemB/Screenshot 2026-01-05 at 14.22.03.png</guid><pubDate>Mon, 05 Jan 2026 06:22:05 GMT</pubDate><enclosure url="." length="0" type="false"/><content:encoded>&lt;figure&gt;&lt;img src="."&gt;&lt;/figure&gt;</content:encoded></item><item><title><![CDATA[研究计划]]></title><link>masters/研究计划.html</link><guid isPermaLink="false">masters/研究计划.md</guid><pubDate>Sat, 27 Dec 2025 19:26:29 GMT</pubDate></item><item><title><![CDATA[image]]></title><description><![CDATA[<img src="living/交通/attachments/image.jpg" target="_self">]]></description><link>living/交通/attachments/image.html</link><guid isPermaLink="false">living/交通/Attachments/image.jpg</guid><pubDate>Tue, 23 Dec 2025 04:22:53 GMT</pubDate><enclosure url="living/交通/attachments/image.jpg" length="0" type="image/jpeg"/><content:encoded>&lt;figure&gt;&lt;img src="living/交通/attachments/image.jpg"&gt;&lt;/figure&gt;</content:encoded></item><item><title><![CDATA[海南板块]]></title><description><![CDATA[20251222 zeterance
Don't chase it, because: apart from Hainan Expressway, there are no earnings
Watch the changes in the year after the customs closure; compare with the Northeast revitalization plan (things that never landed)
2026 speculation theme: "Hainan+": non-local companies developing in Hainan. Wangfujing + China Tourism Group Duty Free, Tong Ren Tang
Hard to become a major hub: no talent advantage, no strong universities (compared with Hong Kong, China and Taiwan, China)
What capital can see: import/export, processing, duty-free tourism. First wave of speculation logic: whatever turns into money quickly
]]></description><link>财富/海南板块.html</link><guid isPermaLink="false">财富/海南板块.md</guid><pubDate>Sun, 21 Dec 2025 22:55:35 GMT</pubDate></item><item><title><![CDATA[电煮锅]]></title><description><![CDATA[2025/12/21 Sunday 03:30, 20 °C, 250 ml of chilled fresh milk
9 chilled quail eggs
room-temperature cold water
20 °C steam/boil program, 5 to 6 minutes until the water boiled; 03:44, 91.5 °C, fully cooked]]></description><link>living/烹饪/电煮锅.html</link><guid isPermaLink="false">living/烹饪/电煮锅.md</guid><pubDate>Sun, 21 Dec 2025 09:52:04 GMT</pubDate></item><item><title><![CDATA[CS5222 Computer Network(2)]]></title><link>masters/sema/cs5222-computer-network(2).html</link><guid isPermaLink="false">masters/SemA/CS5222 Computer Network(2).md</guid><pubDate>Mon, 15 Dec 2025 20:33:50 GMT</pubDate></item><item><title><![CDATA[CS5222 Computer Network(1)]]></title><description><![CDATA[<img alt="Pasted image 20251215053540.png" src="masters/sema/pasted-image-20251215053540.png" target="_self">Store-and-forward model:
a packet is fully received at the next node only once it has been fully transmitted at this node (only transmission delay is considered).
so if the packet starts transmitting at time t and is fully sent at t + L/R, it arrives at the second node and starts transmitting at t + L/R; for a path of N links (nodes 0 -- 1 -- 2 -- ... -- N-1 -- N, each node sending at rate R), the total delay from node 0 is N * L/R
end-to-end delay for a task T: the time interval between the starting time and the fully finished time.
end-to-end delay of P packets of length L over N links with transmission rate R: the first packet takes L/R from node 0 to node 1, and N * L/R from node 0 to the final node N; each following packet adds one more L/R, for (N + P - 1) * L/R in total.
Circuit switch network:
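The store-and-forward delay formulas can be written out directly (a sketch; the function and parameter names are illustrative):

```python
def single_packet_delay(L_bits, R_bps, n_links):
    # store-and-forward: each of the N links adds one full L/R
    return n_links * L_bits / R_bps

def pipelined_delay(L_bits, R_bps, n_links, n_packets):
    # first packet takes N * L/R; every following packet adds one more L/R
    return (n_links + n_packets - 1) * L_bits / R_bps

# 1500-byte packets over 1 Mbps links:
assert single_packet_delay(1500 * 8, 1e6, 3) == 0.036              # 36 ms
assert abs(pipelined_delay(1500 * 8, 1e6, 3, 100) - 1.224) < 1e-9  # 102 * 12 ms
```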
cannot multiplex multiple paths on a single link.
one connection can use only one path
FDM: Frequency division multiplexing
single-link bit rate R; connection time; file size F
TDM: Time Division Multiplexing
Probability that user i is using the network: p; required bandwidth per user; max number of users supported; probability that k of the n users use the network simultaneously: C(n, k) p^k (1 - p)^(n - k)
Combination and Permutation:
The number of permutations of an n-element set is the factorial n!
The number of permutations of k elements of an n-element set is P(n, k) = n! / (n - k)!
The number of combinations of k elements of the n-sized set is C(n, k) = n! / (k! (n - k)!)
Permutation/Combination is about counting the number of ways to fill a tuple. It goes like: there are n choices for the first element, then n - 1 choices for the second, and so on.
dividing by (n - k)!: I don't care about the order of the last n - k elements.
dividing by k!: I don't care about the order of the first k elements
A (4 Mbps) --- 8 msec --- B: time to create the first packet; time to transmit the entire first packet: L/R; propagation delay: 8 msec; total: L/R + 8 msec
total delay = sum over the N links of (propagation delay of link i + transmission delay of link i + processing delay of link i (node i)), with no queueing delay
forward immediately without storing: if there is no processing delay and the transmission rate is the same on both input and output sides, it means the packet is only transmitted once:
Transmission time: once only. The propagation time doesn't change. The total time is their sum.
queuing delay = queued data in bits / link rate
DNS servers: Root (for .org, .com, etc.) -&gt; TLD (top-level domain, for apache.org, bitcoin.org, etc.) -&gt; Authoritative (for httpd.apache.org, downloads.apache.org, etc.)
Persistent HTTP, parallel TCP connections
non-persistent HTTP: one RTT for TCP initialization, one RTT for object requesting
TCP 3-way handshake
objectives: exchange SYNs (synchronize sequence numbers)
make sure the partner has received the SYN number. Approach: the client sends its SYN number and waits for the server to acknowledge it
the server sends its own SYN number and acknowledges the client's in one single packet
the client acknowledges the server's SYN, with data attached. Why the designers decided not to send data at the very beginning: the connection might be insecure: data leakage, privacy issues
useless SYNs waste the server's memory
network congestion caused by invalid handshakes
DNS -&gt; HTML -&gt; Objects: Total Time = Time for DNS + Time for HTML + Time for Objects
Persistent vs. Non-Persistent
Persistent HTTP: open TCP once, and spend 1 RTT for each object
Non-persistent HTTP: open TCP, fetch object, close TCP.
Parallel vs. Non-Parallel
Parallel: often used when non-persistent HTTP is used.
The end-to-end delay of the task "distribute a file of size F to N clients", with a maximum server upload rate u_s and minimum client download rate d_min.
The time is bottlenecked by either the upload link rate u_s or the minimum download rate d_min
Fluid model: instead of thinking of data as discrete packets, simplify with a data-stream model: water flowing through pipes.
Server is the bottleneck: N F / u_s. Client is the bottleneck: F / d_min. Distribution scheme: what the server does, what the clients do. Conditions: the server's upload rate per client is u_s / N, and the server is the bottleneck.
So the minimum client-server time is D_cs = max(N F / u_s, F / d_min). P2P constraints: the server must upload at least one copy (F / u_s); each client must download one copy (F / d_min); at least N F bits must be uploaded in total (N F / (u_s + sum of peer upload rates u_i)).
As N grows, the required upload N F grows, but so does the average service ability. Candidates: minimum download rate, server upload capability, average service ability.
So the minimum P2P distribution time is D_P2P = max(F / u_s, F / d_min, N F / (u_s + sum u_i)).
BitTorrent tit-for-tat:
Prioritize neighbors with the top upload rates
Optimistic unchoking: randomly pick one choked neighbor and serve it
Choke all other neighbors.
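The two distribution-time lower bounds above (client-server vs. P2P) can be sketched directly; the sample numbers are made up:

```python
def d_client_server(F, N, u_s, d_min):
    # server must push N copies; the slowest client must pull one copy
    return max(N * F / u_s, F / d_min)

def d_p2p(F, N, u_s, d_min, peer_uploads):
    # server uploads one copy, the slowest client downloads one copy,
    # and N copies must be uploaded by server + peers together
    return max(F / u_s, F / d_min, N * F / (u_s + sum(peer_uploads)))

F, N, u_s, d_min = 10e9, 100, 100e6, 10e6      # 10 Gbit file
assert d_client_server(F, N, u_s, d_min) == 10000.0    # upload link bottleneck
assert round(d_p2p(F, N, u_s, d_min, [5e6] * N)) == 1667  # average service ability
```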
Sybil attack: create many malicious fake nodes (Sybil nodes) to attack the network.
Difference between UDP and TCP sockets:
UDP and TCP sockets are identified differently when the server demultiplexes incoming data.
UDP: UDP is connectionless; a UDP socket can be identified with only the destination IP and port.
TCP: TCP has a built-in connection concept, so there is one socket per connection, described by a 4-tuple.
UDP demultiplexes on the (dest. IP, dest. port) pair; TCP further demultiplexes with (src IP, src port).
UDP does not discriminate between clients at the socket level, but TCP does.
complement: used for the "inverse" operation -x. The inverted number should be compatible with the addition operation:
Note
in the addition of two n-digit numbers, the carry is either 0 or 1
1's complement: -x is represented as (2^n - 1) - x, i.e. every bit flipped; the number line runs from -(2^(n-1) - 1) to 2^(n-1) - 1, with both +0 and -0. It's "minus by many 1s" mathematically.
The addition algorithm is: for every digit, current digit = a_i XOR b_i XOR carry_i; carry flag = at least 2 of the 3 digits are 1
if the last carry flag is set or the result is -0, add 1 to the result (to overcome the gap between -0 and +1)
if both a and b are positive and a + b fits, fine
a + b cannot be greater than 2^(n-1) - 1, otherwise positive overflow, since the result would fall into the negative range
if one of a, b is positive and the other is negative, the operation is safe: it actually equals a - |b|. If a &gt;= |b|, the flag is set and added to the result, and (a - |b|) is kept as the result.
if a &lt; |b|, the flag is not set and the result falls in the negative zone; it is read as -(|b| - a), which is the 1's complement representation of the negative number
if both a and b are negative, the carry flag is always set, and if the sign bit ends up 0: the result falls in a positive range, overflow
if the sign bit is 1: the result is correct
2's complement: -x is represented as 2^n - x. It's "minus" by 2^n mathematically.
The addition algorithm can be simplified to: current digit = a_i XOR b_i XOR carry_i; carry flag = at least 2 of the 3 digits are 1
the carry flag is dropped. The positive and negative overflow criteria are the same as for 1's complement addition.
divide the data into 16-bit chunks and add them all (using the 1's complement addition described above).
Take the sum's 1's complement
It's less powerful than CRC:
it never fails when there is only 1 bit error (a single-bit error is always detected)
It might fail to detect 2 bits error (because of the addition operation)
Why use 1's complement?
Historical reason.
A compromise of efficiency and performance.
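The 16-bit one's-complement (Internet) checksum described above, as a sketch; the sample words are arbitrary:

```python
def ones_complement_add(a, b):
    # 16-bit one's-complement addition: wrap the carry back into the sum
    s = a + b
    return (s & 0xFFFF) + (s >> 16)

def internet_checksum(words):
    # sum all 16-bit words, then take the one's complement of the sum
    total = 0
    for w in words:
        total = ones_complement_add(total, w)
    return ~total & 0xFFFF

words = [0x4500, 0x0073, 0x0000, 0x4000, 0x4011]
csum = internet_checksum(words)

# Receiver adds all words plus the checksum: the result must be all ones
total = csum
for w in words:
    total = ones_complement_add(total, w)
assert total == 0xFFFF
```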
What a reliable data transfer needs to deal with:
the packet might contain error digits: checksum and acknowledgement
the previous packet might not be acknowledged: timeout and retransmission
the packet might be reordered: alternating-bit protocol ❌
the packet might be delayed and arrive in the same phase of the next cycle: use multi-digit Sequence number with more digits GBN: a sliding-window retransmission protocol send_base nextseqnum
| acked | not yet ack’ed, not yet sent| the rest | |_________ window ___________|
sender:
up to N sent-but-unACKed packets.
timer for the oldest in-flight pkt (i.e. send_base); re-transmit all unACKed packets on timeout.
ignore all duplicate ACKs ("future" ACKs are legal, "past" ACKs are ignored)
receiver:
only accept and ACK the next in-order (cumulative) seqnum
ignore all illegal packets; otherwise, drop and resend the previous ACK
SR (Selective Repeat): set a timer for each packet in the window
individual acknowledgement and re-transmission
Maximum Segment Size (MSS): negotiated to avoid underlying fragmentation.
Sequence Number: Byte-stream number of the first byte in the segment (not including the header)
Warning
GBN and SR ACK the current packet using the current packet's number as the ACK number, but TCP uses the seq number of the next expected byte as the ACK number!
receiver:
ack: seq number of the next byte
Cumulative ACK (GBN-style)
Handling of out-of-order is not specified
RTO: retransmission timeout
TCP RTO is computed in an exponential moving average way.
estimate EstimatedRTT and DevRTT by: EstimatedRTT = (1 - alpha) * EstimatedRTT + alpha * SampleRTT, with alpha = 0.125; DevRTT = (1 - beta) * DevRTT + beta * |SampleRTT - EstimatedRTT|, with beta = 0.25; then RTO = EstimatedRTT + 4 * DevRTT. Use a calculator to write the equations and compute the EstimatedRTT, DevRTT, and RTO columns one by one.
Terminology:
MSS: Maximum Segment Size
cwnd: congestion window, in (MSS)
ssthresh: Slow Start Threshold
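The EWMA-based RTO estimator above can be sketched with the usual RFC 6298 gains; the sample RTT values are made up, and the variance is updated with the old estimate (RFC 6298 order):

```python
def update_rto(srtt, rttvar, sample, alpha=0.125, beta=0.25):
    # both estimates are exponential moving averages
    rttvar = (1 - beta) * rttvar + beta * abs(srtt - sample)  # uses old srtt
    srtt = (1 - alpha) * srtt + alpha * sample
    return srtt, rttvar, srtt + 4 * rttvar

srtt, rttvar = 0.100, 0.025            # seconds
srtt, rttvar, rto = update_rto(srtt, rttvar, sample=0.140)
assert abs(srtt - 0.105) < 1e-12
assert abs(rto - 0.220) < 1e-12        # 0.105 + 4 * 0.02875
```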
approach: sender increases transmission rate (window size), probing for usable bandwidth, until loss occurs
how is loss detected? timeout?
how does the transmission rate increase?
the AIMD principle:
additive increase: increase cwnd by 1 MSS every RTT until loss detected.
multiplicative decrease: cut cwnd in half after loss.
Note
TCP slow start grows FAST!
TCP Slow Start: increase rate exponentially until first loss event, then grow linearly.
initialize cwnd=1
double cwnd every RTT until it reaches ssthresh (cwnd += 1 MSS per ACK received)
Loss Detection:
detected by timeout: indicating that the network is frozen.
reaction: set cwnd to 1, and slow-start up to the threshold
detected by three duplicate ACKs (TCP Reno): indicating that the network still has some capacity but might be suffering overload.
Fast-Retransmit: immediately retransmit the missing packet
Fast-Recovery: cut cwnd to half + 3 MSS, then grow linearly (Congestion Avoidance)
TCP Tahoe always sets cwnd to 1
ssthresh is set to 1/2 cwnd (Fast-Retransmit starts directly from there + 3 MSS)
The receiver uses rwnd to inform the sender not to send more data than this size.
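A toy per-RTT model of the loss-reaction rules above (slow start, AIMD, Tahoe vs. Reno); the per-RTT granularity and the floor of 2 MSS on ssthresh are simplifying assumptions:

```python
def reno_next_cwnd(cwnd, ssthresh, event):
    # one RTT step of a simplified TCP Reno; cwnd/ssthresh in MSS units
    if event == "timeout":            # network frozen: back to slow start
        return 1, max(cwnd // 2, 2)
    if event == "triple_dup_ack":     # fast retransmit + fast recovery
        return cwnd // 2 + 3, max(cwnd // 2, 2)
    if cwnd < ssthresh:               # slow start: exponential growth
        return min(cwnd * 2, ssthresh), ssthresh
    return cwnd + 1, ssthresh         # congestion avoidance: +1 MSS per RTT

cwnd, ssthresh = 1, 8
trace = []
for event in ["ack"] * 5 + ["triple_dup_ack"] + ["ack"] * 2:
    cwnd, ssthresh = reno_next_cwnd(cwnd, ssthresh, event)
    trace.append(cwnd)
# exponential up to ssthresh, then linear, then halve (+3) on 3 dup ACKs
assert trace == [2, 4, 8, 9, 10, 8, 9, 10]
```

Tahoe differs only in the `triple_dup_ack` branch: it would also restart from cwnd = 1.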
rwnd is the "available cache size" of the receiver receive window.
If the persistent timer times out, the sender sends a Zero Window Probe to probe the receiver's window size.
MTU: Maximum Transmission Unit. Ethernet: 1500 Bytes. IP: at least 20 Bytes of header, 1480 Bytes left; non-final fragments require a payload that is a multiple of 8 Bytes
TCP: 20 Bytes of header, 1460 Bytes left. HTTP data is segmented by TCP; IP sends TCP segments. TCP proactively avoids IP fragmentation.
IP header size
If not specified, treat it as 20 Bytes. Total payload to be fragmented: 1600 - 20 = 1580
Payload per non-final fragment: ((500 - 20) / 8) * 8 = 480 Bytes = 60 * 8 Bytes
1580 = 480 * 3 + 140 Number of non-final fragment payload: 3
Remaining payload: 140
If an input port is FIFO and the target output port of the head-of-line packet is congested, all packets in that input queue are blocked (head-of-line blocking).
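The fragmentation arithmetic above generalizes to a short function (a sketch; offsets are reported in 8-byte units as in the IP header):

```python
def fragment(total_len, mtu, ip_header=20):
    payload = total_len - ip_header
    # non-final fragment payload must be a multiple of 8 bytes
    per_frag = (mtu - ip_header) // 8 * 8
    frags = []
    offset = 0
    while payload > per_frag:
        frags.append((per_frag, offset // 8))   # (payload bytes, offset / 8)
        payload -= per_frag
        offset += per_frag
    frags.append((payload, offset // 8))        # final fragment
    return frags

# the worked example: a 1600-byte datagram over a 500-byte-MTU link
assert fragment(1600, 500) == [(480, 0), (480, 60), (480, 120), (140, 180)]
```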
Problem solved: Single Source Shortest Path of a weighted Graph
Strategy: Greedy
Requirements: the weights should not be negative
Core Equation: Bellman-Ford equation: D(m) = min over incoming neighbors n of (D(n) + c(n, m))
Algorithm:
S = set()   // settled nodes
P = map()   // predecessor node
D = map()   // distance estimates
for n in N:
    D.update(n, infty)
D.update(source, 0)
Q = heap(N, lambda n: D.get(n))
while not Q.empty():
    n = Q.top(); Q.pop()
    for m, d in E.get(n):
        if not Q.member(m): continue
        d_new = D.get(n) + d
        if d_new &lt; D.get(m):
            P.update(m, n)
            D.update(m, d_new)
    add(S, n)
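The pseudocode maps to runnable Python with heapq; this sketch uses the lazy-deletion variant instead of a decrease-key heap:

```python
import heapq

def dijkstra(graph, source):
    """graph: {node: [(neighbor, non-negative cost), ...]}"""
    dist = {source: 0}
    pred = {}
    pq = [(0, source)]
    done = set()
    while pq:
        d, n = heapq.heappop(pq)
        if n in done:                 # stale heap entry: skip
            continue
        done.add(n)
        for m, w in graph.get(n, []):
            nd = d + w
            if nd < dist.get(m, float("inf")):   # Bellman-Ford relaxation
                dist[m], pred[m] = nd, n
                heapq.heappush(pq, (nd, m))
    return dist, pred

g = {"u": [("v", 2), ("w", 5)], "v": [("w", 1)], "w": []}
dist, pred = dijkstra(g, "u")
assert dist == {"u": 0, "v": 2, "w": 3} and pred["w"] == "v"
```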
Run: the node achieving the minimum is the next hop; update the routing table according to the Bellman-Ford formula.
each node estimate the time to the destination
each node estimate the time to its neighbor
each node get distance of its neighbor to the destination
each node update the shortest path and know the current shortest time estimated.
Problem Solved: Find Single Source Shortest Path in a distributed / async way.
Type: Dynamic Programming
Idea: every node x maintains and updates an estimated distance D_x(y) to each destination y, rather than computing the true distance d(x, y) directly. Local knowledge: node x receives the DV D_v from each neighbor v; node x knows the link cost c(x, v) to every neighbor v. Update: on receiving a new DV from a neighbor, node x maintains D_x(y) by the Bellman-Ford relaxation: D_x(y) = min over neighbors v of (c(x, v) + D_v(y)). Periodically broadcast the DV to neighbors.
network / device address schema
Reserved:
First address: network ID, not for hosts
Last address: broadcast
CIDR: Classless InterDomain Routing
Classless: arbitrary length of the subnet portion
address = subnet portion + host portion
format: a.b.c.d/x, where x is the length of the subnet portion
scalability issue: over 600 million destinations
administrative autonomy: network admin wants controlAS = autonomous systems
intra-AS routing: routers in same AS use same routing protocol
gateway router: routers that has link to router in another AS
policy vs. performance: inter-AS routing: policy &gt; performance
intra-AS: can have full performance scale: hierarchical routing RIP: Distance Vector
OSPF: Open Shortest Path First, Dijkstra
BGP (Border Gateway Protocol): the de facto standard
eBGP: obtain subnet reachability information from neighboring ASs (where can I go?)
iBGP: propagate reachability information to all AS-internal routers (If you want to visit xxx, come here! It costs y!)
allow a subnet to advertise its existence ("I am here"); word spreads from one to ten, from ten to a hundred, relayed layer by layer like subcontracting agents
Use eBGP to Advertising Prefix: Advertising Prefix to Peer
Promise it will forward datagrams toward this prefix
Can aggregate prefixes in its advertisement. Learn prefixes from reachability info sent by other gateway routers using BGP
EDC = Error Detection and Correction Bits
D = Data protected by error checking, may include header fieldsError detection is not 100% reliable
single-bit parity: can detect an odd number of bit errors
2-dimensional bit parity: detects errors unless bits are flipped at all cross-points of a # (the four corners of a rectangle)
or both the row and column parity bits are flipped
11010011101100 000 &lt;--- input padded by 3 bits from the right
1011 &lt;--- divisor
01100011101100 000 &lt;--- result (the first four bits are the XOR with the divisor beneath, the rest of the bits are unchanged) 1011 &lt;--- divisor ...
00111011101100 000 1011
00010111101100 000 1011
00000001101100 000 &lt;--- the divisor moves over to align with the next 1 in the dividend (since quotient for that step was zero) 1011 (in other words, it doesn't necessarily move one bit per iteration)
00000000110100 000 1011
00000000011000 000 1011
00000000001110 000 1011
00000000000101 000 101 1
-----------------
00000000000000 100 &lt;--- remainder (3 bits). Division algorithm stops here as dividend is equal to zero.
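The XOR long division above can be automated; a sketch using bit strings (slow but matches the worked example):

```python
def crc_remainder(data_bits, divisor_bits):
    # append len(divisor) - 1 zero bits, then do XOR (mod-2) long division
    n = len(divisor_bits) - 1
    bits = list(data_bits + "0" * n)
    for i in range(len(data_bits)):
        if bits[i] == "1":                      # align divisor with the next 1
            for j, d in enumerate(divisor_bits):
                bits[i + j] = str(int(bits[i + j]) ^ int(d))
    return "".join(bits[-n:])                   # last n bits are the remainder

# the worked division above: data 11010011101100, generator 1011
assert crc_remainder("11010011101100", "1011") == "100"
```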
Just Shout As You Want!
Transmission: Transmit as soon as it's ready
Collision detection: no ack
Backoff: wait a random amount of time (a random number decided by the number of retries)
Watch your watch and shout at the beginning of a time slot!
synchronized by time. Can only transmit at the beginning of slot.
network efficiency: the probability that the network is being used without collision. Suppose every participant in the network transmits with probability p.
(slotted) ALOHA: efficiency = P(exactly one of the N nodes succeeds) = N p (1 - p)^(N - 1); the best efficiency tends to 1/e as N grows (pure ALOHA: 1/(2e))
Meeting: do not interrupt others
CSMA: "Listen before speaking". No collision detection.
CSMA/CD: collision detection
CSMA/CA: avoid collisions through methods like random backoffs, ACK frames, and optional RTS/CTS (Request to Send, Clear to Send) handshakes
title: Project Deadline
startDate: 2025-12-15
startTime: 20:49:00
endDate: 2025-12-16
endTime: 12:00:00
type: circle
color: #57ff22
trailColor: #f5f5f5
infoFormat: {percent}% complete - {remaining} until {end:LLL d, yyyy}
updateInRealTime: true
updateIntervalInSeconds: 1
Tutorials Reviewed: 11/11
]]></description><link>masters/sema/cs5222-computer-network(1).html</link><guid isPermaLink="false">masters/SemA/CS5222 Computer Network(1).md</guid><pubDate>Mon, 15 Dec 2025 17:13:12 GMT</pubDate><enclosure url="." length="0" type="false"/><content:encoded>&lt;figure&gt;&lt;img src="."&gt;&lt;/figure&gt;</content:encoded></item><item><title><![CDATA[Bitcoin]]></title><description><![CDATA[moonpay. The two-factor authentication service is provided by TrustedCoin. It uses a multisig wallet: you own 2 of the 3 keys, and the third key is stored on a remote server that signs transactions on your behalf. To use this service you need a smartphone with Google Authenticator installed. A small fee is charged for every transaction that uses the remote server. After installation you can check and modify your billing preferences. Note that your funds are not locked in by this service: you can withdraw them at any time, with no remote server involved and no fee, by using the "restore wallet" option and entering your wallet seed phrase. The next step generates the wallet's seed phrase. The seed is not stored on your computer, so you must write it down on paper. To defend against malware, you can perform this step on an offline computer first, and then move the wallet to a networked computer.]]></description><link>财富/市场/crypto/bitcoin.html</link><guid isPermaLink="false">财富/市场/Crypto/Bitcoin.md</guid><pubDate>Mon, 15 Dec 2025 09:56:49 GMT</pubDate></item><item><title><![CDATA[Pasted image 20251215053540]]></title><description><![CDATA[<img src="masters/sema/pasted-image-20251215053540.png" target="_self">]]></description><link>masters/sema/pasted-image-20251215053540.html</link><guid isPermaLink="false">masters/SemA/Pasted image 20251215053540.png</guid><pubDate>Sun, 14 Dec 2025 21:35:40 GMT</pubDate><enclosure url="." length="0" type="false"/><content:encoded>&lt;figure&gt;&lt;img src="."&gt;&lt;/figure&gt;</content:encoded></item><item><title><![CDATA[EC5001 Electronics Commerce]]></title><description><![CDATA[Session I (50%)
Question1
XML/DTD Design
App development and revenue
Technical and legal aspect
Question 2
Explanation and differentiation of different terms/concepts with no more than 2 sentences
Session 2(50%)
Question 1 (Part I of the course)
What is e-commerce: "exchange online"
1. Overcome the Long Tail
2. interactions
1. mobile commerce
2. social computing
3. e market
1. (Seller-2-Buyer)
4. web 2.0
5. social computing
6. social networks
Business models and marketplace models: Revenue model
Value Proposition Why us
Why the product
Problem? B2C e-tailer: sales
community provider: hybrid revenue model
content provider: subscription, pay per download
portal: advertising
transaction broker: transaction fees
market creator: transaction fees
service provider: sales of service, subscriptions, sales of marketing data, advertising Social Shopping: trust
B2B Transaction: Spot buying, Strategic sourcing
B2B Portal
sourcing
e-Auction: tendering / bidding, Request For Quote
controls (control measures): preventive, detective, corrective
Computer Crimes Worms
Trojan Horse
Malware
DoS (Denial of Service)
Spyware
Spam
Phishing
analysis and critical skills for e-commerce cases. Goal: external and internal environment
macro environment: the big picture. PEST: Political, Economic, Social, and Technological. Micro environment: industry level. Bargaining power of buyers/sellers
Competitions: Threat of new entrants, Rivalry of existing firms
Threat of substitute products of services
Strategy: Cost leadership
Differentiation
Niche
Growth
Alliance
Innovation
Entry-barrier SWOT: internal: Strength Weakness
External: Opportunity Threats Business Model Canvas
Organizational Strategy. Strategy initiation: analysis and value, core competences, forecasts, competitors. Formulation: opportunities, cost, risk, plan
Implementation: detailed, short-term
project planning, resource allocation
project management Assessment of the strategy key technologies
regulatory, ethical and legal aspects
emerging trends
Part I: Information System]]></description><link>masters/sema/ec5001-electronics-commerce.html</link><guid isPermaLink="false">masters/SemA/EC5001 Electronics Commerce.md</guid><pubDate>Sat, 13 Dec 2025 03:26:56 GMT</pubDate></item><item><title><![CDATA[Software Engineering]]></title><description><![CDATA[Software Engineering has activities, with corresponding deliverables and tools. Different orderings of activities form different SE models.
SE models]]></description><link>masters/sema/software-engineering/software-engineering.html</link><guid isPermaLink="false">masters/SemA/software engineering/Software Engineering.md</guid><pubDate>Fri, 12 Dec 2025 13:35:55 GMT</pubDate></item><item><title><![CDATA[Vision N Image]]></title><description><![CDATA[Cross Correlation: point-wise weighted sum
Convolution: a point-wise weighted sum with the kernel weights flipped about the main symmetry axis
High-pass filtering: high-frequency information
Low-pass: low-frequency information
gradient: theta, strength, df/dx, df/dy
sobel:
laplacian:
Homogeneous coordinates: unify rotation, translation, scaling, stretching, etc.
Harris corner detection: for every point, compute the sum-of-squared-differences change as a quadratic form with a matrix; that matrix describes how strongly the patch changes overall (every direction is evaluated with this same matrix)
SIFT keypoint computation: DoG is used to approximate LoG, reducing computation
Build different scales; each scale uses a different Gaussian filter, and subtracting them gives the Difference of Gaussians (DoG) image, which approximates the Laplacian of Gaussian (a second derivative, again a deep basin)
Peak detection: compare the current value with its neighbors over the 3x3 windows at the current scale and the scales above and below, 9 + 8 + 9 = 26 points in total; if it is larger or smaller than all 26 neighbors, it is a peak and needs a descriptor
The descriptor: find the gradient magnitude and orientation at the current scale for every pixel of the 16x16 window, merge pixels into cells, and accumulate the amounts in the 8 directions into one unified histogram.
Pinhole camera model: object size / object distance = image size / focal length.
Camera intrinsics: scaling + translation
Camera extrinsics: the rotation matrix in the world coordinate system concatenated with the translation
Projection matrix = intrinsics x extrinsics
Disparity
baseline: the line connecting the two camera centers (apertures)
epipoles: the camera center as seen from the other camera (the intersection of the baseline with the image plane)
epipolar plane
First recover the pose (rotation) of the photos from the two cameras' intrinsics and extrinsics, then search along the horizontal line for similar keypoints and compute the disparity d. Depth can then be computed proportionally as f * b / d.
brightness constancy: the change carried by the optical flow equals the change in intensity at that point (a bit like Kirchhoff's law); this lets us build an equation to estimate the optical-flow vector.
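The flip-vs-no-flip distinction between cross-correlation and convolution can be seen in 1-D NumPy (the toy arrays are arbitrary):

```python
import numpy as np

signal = np.array([0., 0., 1., 0., 0.])   # a unit impulse
kernel = np.array([1., 2., 3.])

# Correlation slides the kernel as-is; convolution flips it first,
# so convolving equals correlating with the reversed kernel.
conv = np.convolve(signal, kernel, mode="same")
corr = np.correlate(signal, kernel[::-1], mode="same")
assert np.allclose(conv, corr)

# Convolving an impulse reproduces the kernel (flipped, then re-flipped).
assert conv.tolist() == [0., 1., 2., 3., 0.]
```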
gaussian pyramid
corners
laplacian of gaussian
]]></description><link>masters/sema/vision-n-image.html</link><guid isPermaLink="false">masters/SemA/Vision N Image.md</guid><pubDate>Fri, 12 Dec 2025 13:34:44 GMT</pubDate></item><item><title><![CDATA[Reviews]]></title><description><![CDATA[ADT - Operations - Axiom
Define core operations for the ADT. Publish axioms for the ADT as a reduction rule set. Axiom requirements: accessors/observers applied to constructors -&gt; a reduced expression with a value OR exposing a constructor; if-then-else may be used.
A scenario context of a software process
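For instance, a stack ADT with constructors and observers, and its axioms checked as reductions (a hedged, list-based sketch):

```python
# Stack ADT: constructors new() and push(s, x); observers top(s), pop(s), is_empty(s).
def new(): return []
def push(s, x): return s + [x]
def top(s): return s[-1]          # partial: undefined on new()
def pop(s): return s[:-1]
def is_empty(s): return s == []

# Axioms: each observer applied to each constructor reduces to a value
# or exposes a constructor.
s, x = push(new(), 1), 2
assert is_empty(new())
assert not is_empty(push(s, x))
assert top(push(s, x)) == x       # top(push(s, x)) = x
assert pop(push(s, x)) == s       # pop(push(s, x)) = s
```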
Asked for process improvement. A software engineering process model is a set of activities, techniques, models, and tools. Activities: whatever happens in the software lifecycle, including collecting user requirements (Requirement Engineering). Techniques: use cases / user stories / use case diagram / meetings
Deliverable: Requirement document Designing software: Software Architecture Techniques: UML patterns/principles/tactics
Deliverable: software design document Coding/implementation: Code with Quality Techniques: Java/C++/framework/platform
Deliverable: Code listing and test scripts testing Techniques: Unit test framework/debugger
Deliverable: QA report Deploy Techniques: Standalone software/plug-in/app/web services/
Software at user site Maintain Techniques: Bug reporting/software repository
Bug report &amp; s/w release Configuration management Techniques: Version control / change management software
Code change/patch/change history report Project management. Techniques: work breakdown structure / work scheduling algorithm
Deliverable: project status and status tracking Best Practices: Different ordering of activities Waterfall model: step by step. completely produce the full set of deliverables of each activity before starting the next activity. Requirement-Design-Coding-Testing-Deployment-Maintenance V-shape waterfall model: test all before deployment (Waterfall + stagewise validation goal) unit-test, system test, user acceptance test.
problem for all waterfall-like models: a false sense of clear-cut phases of activities. Unrealistic; costly to correct upstream problems.
nothing is delivered before everything is done. Lesson learnt: fixing bugs at earlier stages is cheaper
we would rather have a degraded subsystem than no system. Software Process Improvement (New Era!): some stages can be done in parallel
backward iterations: improve previous activities before creating more bugs caused by the problems of the previous defects. Prototyping: code before design and requirement engineering. So users can point their requirements based on the prototype rather than describing them abstractly.
prototype: a preliminary version of the final product; buggy, with features partially implemented
Loop(coding, requirement engineering design) - Testing - Documentation - Deployment - Maintenance Incremental software development model: divide and conquer divide the set of requirements into subsets and implement them incrementally.
loop (sub SE process) Spiral model: plan, do prototyping, and plan more systematic
4 steps for 1 iteration: determine objectives, alternatives and constraints
evaluate alternatives, identify risks, and resolve risks
develop, validate and verify next-level
plan next level Unified Process (UP)
Agile Methods: fewer tools, activities, process steps, intermediate products XP: no PM
Scrum: PM, less tedious process Hybrid of Agile with Waterfall A scenario context for technical debt
Ask for the technical debt dimension and strategy What is TD: doing things in a "quick and dirty" way creates a TD.
incurs interest payment: extra effort
TD is a design or construction approach that is expedient in the short term but costs more to fix later than it would to do now.
example: glue code. fix it next release.
the only effective way to reduce TDs is to refactor
TD Dimension
Type: Requirements TD
Architectural TD
Design TD
Code TD
Test TD
Build TD
Documentation TD
Infrastructure TD
Versioning TD
Defect TD Intentionality: prudent vs reckless, deliberate vs inadvertent
Time horizon: short-term: reactively, for tactical reasons
long-term: proactively, for strategic reasons Degree of Focus: Focused Debt: intentionally incurred and managed
Unfocused debt: no clear strategy or priority. no documentation, no communication, keeps growing, no long-term planning strategy, not systematically identified and managed. TD management: Identify, track, discuss
Identify patterns: Schedule pressure: unreasonable commitment
Use a more flexible planning approach duplication of code lack of experience, copy-and-paste, poor design, pressure to deliver
static analysis tool, pair programming
Repay now, or track it for repaying later (add a runtime exception), add it to the backlog. getting it right the first time risks over-engineering. Yet another case study is given
Ask for addressing specific problems in the requirements elicitation and requirement engineering processes. Requirement Engineering: find out and structure the functional and non-functional requirements
Multi-level concerns: industry sector, same type of software, current application
results: requirement specification (contract, starting point of design). common mistakes: noise, silence, over-specification, contradictions, ambiguity, forward references, wishful thinking. RE process:
4 steps: elicitation
specification hierarchical structure: decompose
link to specific stakeholders validating the requirement validity: same requirements from multiple sources (the more links, the better)
inspecting the specification w.r.t. correctness, completeness, consistency, accuracy, readability, and testability
aids at different stages: an early structured walkthrough with customers
prototypes of initial versions
Test plan or unit testing
User acceptance testing. negotiation. RE analyst: actions: analysis: to find ambiguity
negotiation: to resolve ambiguity. 4 basic positions: knowledge -> subjective-objective (no idea), world -> conflict-order (cannot be controlled by the analyst). functionalism (objective + order): social-relativism (subjective + order): guided by the analyst
radical-structuralism (objective + conflict): choose for either party
neohumanism (subjective + conflict): bringing parties together. RE Elicitation activities: understanding the application domain. constraints enforced by the environment: political, organizational, social ... identifying sources of requirements. People: stakeholders, users and experts (provide details)
Things: existing system and operation process, existing documentation (reports, manuals, forms). analyzing the stakeholders. stakeholders: people who have an interest in the system; groups and individuals internal or external to the organization
customers and direct users of the system. identifying key user representatives and product champions. selecting the techniques, approaches and tools to use; several elicitation techniques are employed at different life cycle stages. Eliciting the requirements from stakeholders and other sources. targets: establish the scope of the system
investigate in detail the needs and wants
the future processes the system will perform w.r.t. the business operations
the major objectives of the business. techniques: interviews to get the background
group: brainstorming. Software Architecture: ask about the relationship between requirements and
software architecture. modularity: the style of decomposing a module into a set of connected submodules
use styles to address non-functional needs; assign functionality. violation and fulfilment of design principles. Code with Quality: a piece of code is given; ask for redesigning the code. Code with Quality: a piece of code is given; ask about metamorphic testing. metamorphic testing: when the expected output cannot be determined, use expected relations (MRs) instead. Metamorphic Relations: should be general, and should not introduce new (untested) functionality. A piece of code is given; ask about delta debugging. delta debugging: an input- or code-reduction process; find the minimum crash-causing input. Modern Code Review
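The input-reduction idea behind delta debugging can be sketched as a simplified ddmin-style loop. This is a toy version (not the full ddmin algorithm), and `crashes` is a stand-in predicate for "this input still reproduces the failure":

```python
def reduce_input(data, crashes):
    """Greedily shrink `data` (a string or list) while `crashes(data)` stays true."""
    n = 2  # current number of chunks
    while len(data) >= 2:
        chunk = len(data) // n
        shrunk = False
        for i in range(n):
            candidate = data[:i * chunk] + data[(i + 1) * chunk:]  # drop chunk i
            if len(candidate) > 0 and crashes(candidate):
                data, n, shrunk = candidate, max(n - 1, 2), True
                break
        if not shrunk:
            if n >= len(data):
                break  # already at single-element granularity
            n = min(n * 2, len(data))  # try finer-grained chunks
    return data
```

For example, `reduce_input("aaXbb", lambda s: "X" in s)` shrinks the input down to the single crash-causing character `"X"`.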
A workflow is given; ask for workflow customization. Software Architecture: an architecture is given; ask for architectural revision to address quality
attributes. architectural drivers; attribute-driven design
Identifying QAs as Driver
Evaluate the choices
]]></description><link>masters/sema/software-engineering/reviews.html</link><guid isPermaLink="false">masters/SemA/software engineering/Reviews.md</guid><pubDate>Thu, 11 Dec 2025 09:30:45 GMT</pubDate></item><item><title><![CDATA[Big Data Tools]]></title><description><![CDATA[Kubernetes: multi-node elastic scaling
minikube/kind: run a single-node k8s cluster locally, using docker as the underlying layer
kubectl: the k8s command-line client
Hadoop ecosystem
HDFS: the storage layer
MapReduce: the computation layer
Hive: the SQL query layer
Spark: in-memory buffering
Flink: streaming messages
Datahub: automatic data-lineage analysis, data definitions, data management, data status monitoring
Airflow: DAGs defined in code; Airflow 3 separates the workers from the core modules; well suited to monitoring and scheduled batch jobs
Kafka: messaging component; receives and persists messages in real time
Snowflake:
elasticsearch：]]></description><link>tech/bigdata/big-data-tools.html</link><guid isPermaLink="false">tech/bigdata/Big Data Tools.md</guid><pubDate>Wed, 10 Dec 2025 13:38:46 GMT</pubDate></item><item><title><![CDATA[价值价格、市场、经济央行的思考]]></title><description><![CDATA[价值，在末世里，流通性很差的时候，食物这些东西的价值会比太平盛世高；对一个狂热的书画收藏家来说，一副画的价值会比一副表的价值高，对于一个狂热的程序员来说，这些东西都不如最新的技术值。因此价值是根据人的价值观、时间、情形而变化的，价值是主观的概念。结论：价值是主观的价值的多少即价格通过交易来衡量，而在一个市场中，市场的价格由众多成交价决定。成交是时间离散的，如果一个物品在市场上能够自由地交易，那就会有人以低于成交价的价格买入，以高于成交价的价格卖出套利，因此买家会压低价格，卖家会尝试抬高价格，最终价格回归价值。但是交易的时候使用的不管是什么，都是以物易物，都有量的比：A/B，如果有一般等价物，那就会以这个一般等价物来衡量价值，例如黄金、美元、人民币这样易存储、而且在市场内所有人都认可、流通性强的东西。人们的欲望就是占有更多资产，这些资产的价格一般由当地市场的流通性最强的东西（货币）来衡量。结论：价格是市场上参与者价值观的客观体现，价值的衡量依托于市场以及它使用的一般等价物。物品价值的一种衡量方法是，你愿意以多少其他物品，兑换这个物品。这就形成了一个巨大的1-n关系，计算所有价值，就形成了一个巨大的n-n关系，或称为你的“价值表”。但是你愿意不等于你能够，市场衡量价值的方法是市场价格，它是这个市场上所有人表现出的总体价值表，这个市场上的投资者、投机者和对冲者共同制定了这个价值表。而对于现在的流通性好的、大规模市场来说，流通性最好的物品是货币，所有的东西都以货币计价，因此可以以货币c为中间物，市场对a和b的价值比就是a/b = (a/c) / (b/c) 。所有的东西都由货币中转了。结论：市场的价格体系是市场上所有人的价值观的综合体现那么我们在交易的时候，如何衡量我们的交易是否成功呢？那就是货币衡量的价格。但是，不同的市场认定不同的一般等价物，例如在中国你没办法直接用美元买菜，在一个野蛮部落里你也无法用美元获取资源。所以我们采用哪种货币作为衡量呢？一种货币能够转换为另一种货币，但是货币在大多数情况下本身没有价值，要兑换为当下需要的资源。大部份市场认定单一的一般等价物，因此需要在不同的市场上交易的时候需要把货币换为当地货币。而即使是使用同样的货币，市场的价格体系也可以不同。结论：市场不同，价格体系不同所以要靠n-n表衡量我们获得的价值多少，几乎是不可能的：首先你要清晰地阐述自己的价值观，为所有物品标上价码，而且还要确认自己在哪些市场交易。而价值的多少衡量，本质是找一个数字，这正是货币的1-n的模式。因此货币可以直接拿来作为价值的单位，在衡量多少这个功能上，与任何其他单位是等效的。因此我只需要自己为所有物品标上加码，然后我发现这是不必须的——因为我只需要拿我拥有的货币多少作为我在这个市场上持有的价值即可。结论：货币天然是一种方便地用来衡量价值多少的工具市场间的交易：如果我有一篮子商品，赋予不同权重，它们的加权价格作为一个市场上的价格水平。对于全球配置资产的人来说，他们之所以全球配置资产，是因为对自己或对自己的事业来说有流通性，最终可以用来产生或换取其他资源，最终成为自己能够支配的货币。用哪种货币呢？我们相信汇率可以用来将所有货币串起来，但是汇率只考虑了国际服务和商品，不便当地交易的服务是不能方便地同时用两个价格计价，因此应当使用购买力平价，还要找一个最大最稳定的经济体来做锚定。但是如果我们仅仅用货币来做衡量的话，好像也会失真：毕竟如果你用解放战争时期的金圆券衡量你的财富，那你牛逼坏了——所有人的财富都在暴涨。但真是这样吗？因此最合适的方法是，制定一篮子商品服务，然后衡量购买力。衡量财富的最好方法是你持有的流通性好的资产能直接换取的这一篮子服务商品。但是这个篮子会随着时间推移因为新商品新服务和你的价值观改变而改变。因此似乎我们永远也无法准确地描述我们有多少财富。只能又求助于大众：说在当前大众观念下的这一篮子商品服务下，我们能换到多少个单位的篮子，这就是我们的财富总量。财富的客观衡量：在当前大众观念下的这一篮子商品服务定义的资源单位，我流动资产能经货币换取的资源量。如果我能够美国工作中国花，在美国我的工作价值以美元计价，当
地的消费高企，而在中国我在汇市把美元兑换为更多的人民币，换取低价的当地商品服务，那爽爆了。也就是说要有这种相对市场上其他求职者的差异优势，就能赚市场的钱。三角套利：如果
, , 且 那么存在套利空间，做以上三次操作就可以让钱变多
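The triangular-arbitrage condition above can be expressed as a small check: convert one unit of currency A through B and C back to A, and compare the result with 1. The rates and fee below are illustrative, not real quotes:

```python
def has_triangular_arbitrage(r_ab, r_bc, r_ca, fee=0.0):
    """True if converting 1 unit of A -> B -> C -> A ends with more than 1 unit of A."""
    final_amount = 1.0
    for rate in (r_ab, r_bc, r_ca):
        final_amount *= rate * (1 - fee)  # apply each conversion and its fee
    return final_amount > 1.0
```

For example, rates 0.9, 1.1 and 1.02 multiply to about 1.0098, so a fee-free cycle would be profitable, while a 1% fee per leg erases the edge.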
如果 ，过一段时间变成了 ，
那么 ，只有 ，即一人民币能兑换更少美元到能兑换更多美元到时候——即贬值的时候，才会对美元发生贬值。如果人民币贬值，那么赚人民币的中国厂家的美国消费者就可以用更少的美元支付，因此利好中国厂家竞争力。如果不幸人民币升值，那么赚人民币的中国厂家就会失去竞争力，但是还没汇回的款就拿到了更多的人民币。贬值利于本国生产，升值有利于外国投资。为什么会国际商品价格差这么多，而国内商品服务价格却差不多？假设没有国际贸易，每个国家的工人都要求月薪3000，假设现在有了国际贸易，从美国赚了很多美元拿回来花不掉，美国只赚走了一点点人民币，那国家一看进来了这么多外汇，国家给你全部换了人民币，而且贸易规模越大，国家还印了越多钱用来流通，下次美国人来赚走人民币的时候就能用人民币换走这些美元了。如果美国人从中国赚的没中国人从美国赚得多，那美元换人民币需求旺盛，人民币自然升值，国家可以顺势印钞稳定汇率。但美国就很难受了，由于逆差，大家都要把赚的美元换成人民币，那人民币就很贵，美国政府就不能印钞来稳定汇率，只能接受人民币升值，但是老中把人民币送给你换换美元，人民币汇率就可以稳定了。这样顺差的老中就可以通过投放还是不投放国际人民币来双向操纵汇率，手上就拿得到很多的美元和用美元换的美债，但是老美也很烂，美元美债都给你赚走了我用啥，于是就印钱印钱。特朗普上台后打贸易战封锁中国，贸易量缩小，结果老百姓手上钱花不出去，砸在手里，于是价格全部上涨，美元实际贬值了。中国卖不出货，货砸在手里，结果裁员、打折促销，裁员后人们消费又有限，市场流通的钱不够，于是钱变贵通缩了。人们找央行借钱，央行就印钱，等着人们还钱。然后a人赚了b的钱，还了贷款，又借了一笔更大的，b又赚了a的钱，还了贷款，然后又借了比更大的，于是a总觉得能从b哪儿赚更多的钱，就去找央行借更多钱，b也这么干，于是央行的资产负债表就越来越大，直到市场上交易量太大，经济过热，然后大家都其实消费不了这么多，然后有人暴雷，然后经济崩坏。但是央行不会让这发生，作为最大的放贷人，发现经济过热的时候，例如失业率0，生产的牛奶超级多，厂商打折促销，于是他就会调高贷款利率，提高全社会贷款成本，鼓励大家存款不花钱，让大家的大部份贷款躺在银行里。这样流通的钱就少了。而经济过冷的时候，他会把贷款利率降低，让大家来贷款，鼓励大家存款搬家进资本市场，这个时候流通的钱就多了，消费意愿上来，商品价格上来，工厂也马力开足。或者调节商业银行的准备金比率也能控制放贷的速度。因此央行就是扭水龙头，通过控制货币量来控制货币流通速度，从而让a和b觉得“今年生意好做明年扩张多一家店”或“今年生意不好做把关几个店还贷款吧”来调节经济活动剧烈程度。市场价格的变动是否剧烈取决于你的交易频次：如果你需要频繁交易年货，那么你就应该以很高的频率查看市场价格，就有“今天和昨天差不多”；如果你只是每年买下年货，那年货的价格波动就是“今年比去年贵好多”。如果你能准确分析猪产量周期，而且猪产量和猪肉价格正相关，且猪周期是按月为单位算的，那么你买卖猪肉期货的频率就应该以月来算，不然就有很多噪音要分析了。股票的价值会剧烈波动吗？不会，他只有内在的分红的价值和资产价格的价值，或者投机的价值，但长期来看容易分析的还是分红的价值和稳定成长的价值。股票的价格却会剧烈波动，因此股票的价格跌到低于价值很多的时候就要买入，超过很多的时候就要卖出。每天的事情都是一件件发生的，每天太阳都只会升起落下一次。项目要一周周的推进，新闻要每天的早上召开发布会发，业绩要一个季度一个季度的做，战争按年打，因此不同的题材有不同的交易周期。]]></description><link>财富/经济/价值价格、市场、经济央行的思考.html</link><guid isPermaLink="false">财富/经济/价值价格、市场、经济央行的思考.md</guid><pubDate>Mon, 01 Dec 2025 21:40:03 GMT</pubDate></item><item><title><![CDATA[日K策略]]></title><description><![CDATA[亏损幅度：10%低位横盘或上涨趋势满仓干，高位下跌趋势博弈反弹则半仓进
双启明星接绿柱：抄底博弈一根反弹——得手就走
突然大涨：后一日只接一根——等待磨穿成本再入场
不接盘：放量绿柱，连续绿柱，上涨转下跌的趋势
稳定的股票的下跌波段：超跌入场等回归均线，目标达8成卖出
上涨趋势：趋势早期陪半仓，等待肉垫形成，另外半仓博弈回调，趋势早期错过直接满仓博弈回调
博弈回调：只博弈一根，第二天无论输赢直接出超涨是卖出机会，超跌是买入机会。但都需根据历史股价序列来做判断。
要考虑近期被套的人的心理市场价格由多空力量决定，市场价格反应多空博弈。做多做空被杀都会产生畏惧心理从而退缩而消停几日，但价格变化又会打破力量平衡带来短期反转。趋势早期会向旧趋势恢复，市场在拉锯后确认趋势后会趋势会继续发展。但多头力量反转趋势很少一波流，通常会和追涨筹码追求第二波上涨。诱多被杀通常意味着上涨阻力减小，机会更大。若出现多次明显进攻，则机会进一步增加。在达到上一波筹码高峰时通常会提前回调以释放压力，回调幅度越大后续上涨越强。上涨趋势转平转弱通常会动摇持仓者信心，大跌即将到来。但若未有效回调，前期趋势仍会坚定持仓者和新入市者的信心，削弱做空者信心。两根中阳线诱多的可能性大，后续十字星更能确认诱多，但击穿被诱多的筹码后可以博弈一根回调。超涨放量可能是二次诱多，但也可能是真拉涨，但若在高位或上涨趋势后可以考虑提前减仓，若后续只是有小回调可以博弈继续上涨。]]></description><link>财富/投资/日k策略.html</link><guid isPermaLink="false">财富/投资/日K策略.md</guid><pubDate>Sat, 22 Nov 2025 15:30:16 GMT</pubDate></item><item><title><![CDATA[2025-10-14 技术分析]]></title><description><![CDATA[多头买法原理：
止损线或者见顶卖出
见底买入并设置止损线
A股小散不支持做空，但是庄家会杀做空的空头，这是潜藏机会
⚠️做空和杠杆做多的风险：保证金制度下，急跌容易被强制平仓资产清零基本价值不变时，行情变化规律就是：
阴久必阳：空头力量逐渐变小，形式逐渐利多，空头最后若决战失败，将潜伏等待或转化为多头
阳久必阴：同理 基本图形 光脚就是往回拉，光头就是往下按
长上影说明空头很强，但是也说明多头也不弱，主动性很强，只是暂时受挫
柱体是单边行情：多/空头力量强盛
十字星：多空力量平衡 不能只看一根，最少要看两根 周和日中间还有五个日粒度的级别，多根合在一起才能反映真实走势 上涨/下跌后的十字星：多/空头力量由盛转衰，风险/机会开始出现
反包 乌云盖顶：空头庄家迎合多头主动拉高，并在高点砸盘出货，最后收跌一根大绿柱，意味着单边下跌行情开启
阳线反包：多头击杀空头
下跌后第二次十字星并没有创新低：总决战中多头力量战胜空头力量，空翻多，单边下跌行情结束 模式 猫猫头走法：单边上涨行情下高开诱多后下杀诱空，再上涨戳破前高杀空，立马暴跌排除第二波空头并杀多
单边下跌后的大阳柱/上涨后的大阴柱：做多/做空力量彰显强大决心，若价格随后进入震荡期，最终必翻转，但中间可能经历多次假突破的大幅度多空杀 选择高风险收益比：选择底部就是选择低风险，选择单边行情就是选择高收益。所以散户应该在下跌行情结束和上涨行情开始成立后入场
最好的入场点是底部，其次是回调，
承担风险才能获得收益，需要承担短期下跌的压力
但是一定要为每一笔设定止损
散户入场时机：空翻多见底时刻，并设置前低为止损价。否则散户因承担不了大风险，容易频繁操作导致交易损耗甚至被套牢/错失行情。最近一个柱的演绎：
大阴/大阳：都是市场单边行情出现被攻击的信号 高开阴柱：大概率光头线，可能光头大阴柱，也可能多头反击上吊线
低开阳柱：大概率光脚线，可能长上影，也可能大阳柱，也可能战平十字星，看有没有新高/新低 平开延续行情，需要看后续走势
所以如果在单边行情对手盘刚开始攻击的时候就加入会产生非常大的风险，随时出发止损，因为不知道哪一方最终获胜。
长周期均线作用：穿越判断牛熊
三个均线长周期：120、144、300
两个严格条件： 假破位300后反杀144、120:牛市继续
假突破300后反杀120、144:熊市开始 分析庄家意图
平准基金？非大底、大顶则意图不明
投机资本？割韭菜的，你死我活
单边趋势行情：小级别潜伏上涨？主升浪、加速周线、日线、120分
底仓建立：根据日线级别特征设立底仓，至少要有两个逻辑成立： 周、日K牛市确立（均线上）
底部反转，缩量回调未放量大阳柱
突破震荡箱体：穿越boll中线
然后设立止损条件单，并一天后再来看 第二天有了可卖头寸后，就可以盯盘捕捉上车机会： 从日线、120、30、15、5直到一分钟，检测是否有逻辑成立
如果在某一级别发现强逻辑底部，勇敢上车，并设立止损
在该级别下盯盘，走出趋势后可以切换到更大级别盯盘
出现盘整，回到能吃三根绿的最低级别盯盘 一旦可卖头寸耗尽，停止买入，安坐轿子
不要一次买完：每买入一份，可用的可卖头寸就少一分：意味着一旦第一次买失误触发止损，后续再多机会也只能望洋兴叹
所有强势上涨趋势都有潜伏涨、主升浪、加速三个阶段，只要还没到加速就可以择机开多头单
]]></description><link>财富/交易/2025-10-14-技术分析.html</link><guid isPermaLink="false">财富/交易/2025-10-14 技术分析.md</guid><pubDate>Tue, 14 Oct 2025 03:12:00 GMT</pubDate></item><item><title><![CDATA[融资融券]]></title><description><![CDATA[融资融券保证金制度（上证交易所），每次融资融券使用的保证金应以保证金余额为限也就是说，保证金可用余额决定了你能融资融券的金额，投资者账上的现金+资产代表了投资者的偿付能力，但用于抵押的资产需要用折算率打折扣，而且账上抵押资产要有融券卖出的偿付能力。有了保证金制度后，融资融券变成了一种抵押贷款手段，它让投资者能够抵押自己的资产来换得资金或股份。券商会通过调整折算率来规避风险。例如2025-10-09日，多家券商宣布将中芯国际等9支高市盈率股的折算率调整至0，这一举动将清零所有在这些股票中获利的投资者利用这些股票浮盈的融资融券能力，从而促使他们抛售兑现以求恢复保证金余额。]]></description><link>财富/市场/股市/融资融券.html</link><guid isPermaLink="false">财富/市场/股市/融资融券.md</guid><pubDate>Thu, 09 Oct 2025 20:40:22 GMT</pubDate></item><item><title><![CDATA[2025-10-08 股市分析]]></title><description><![CDATA[红利股逐渐跌出性价比
科创板股估值过高缺乏业绩支撑泡沫较大风险较高
创业板仍具想象空间可能成为下一轮引擎
沪深300蓝筹股可能被用来压指数
反内卷：矿业、电车等行业反内卷初具成效下半年财报有望改善根据这个分析，结合福布斯esg50，给出接下来的A股个股和etf和港股etf操作推荐，资金量4w单按开盘价操作，波动更大，适合低开买高卖做短线，盘中高价出售]]></description><link>财富/交易/2025-10-08-股市分析.html</link><guid isPermaLink="false">财富/交易/2025-10-08 股市分析.md</guid><pubDate>Wed, 08 Oct 2025 19:15:31 GMT</pubDate></item><item><title><![CDATA[Trigonometry Functions]]></title><description><![CDATA[为什么用单位圆和直角坐标系定义三角函数
用直角三角形定义的话角度被限制在0至，用角边和单位圆交点坐标定义则能轻松将三角函数的定义域扩展到任意实数值。
关于
我们现在不把它当作圆周率，而是平角的角度
由平面直角坐标系
单位圆与x轴正半轴交于点，以直线为一边，另一边与单位圆交于，且的坐标为，则角的正弦和余弦值为。换句话说，正弦函数和余弦函数将角的大小映射到的坐标分量上。的坐标也可记为Note
将所有常数写在前面方便后续套用改变运算顺序或消掉常数
大小为的角与单位圆的交点与关于x轴对称，因此有：大小为的角与单位圆的交点与关于原点对称，由此有：大小为的角与单位圆的交点与关于对称，由此有：仅使用勾股定理，分锐角三角形、钝角三角形、直角三角形三种情况（钝角三角形证明部分需要利用诱导公式）可证明对于任意由点ABC围成的平面三角形，记角ABC的对边长度为，角A大小为，有：作图，作出大小为, , 的角与单位圆的交点，利用两个角度大小为的弓形的弦长相等得Note
这样的标号应看作一个复合函数在的值
其余和差公式的推导不过是运用实数加减法运算性质、余弦差公式、和诱导公式由于正弦余弦函数和差公式的齐次性，可得的和差公式令，代入上面的和差公式容易得到：将倍角公式反过来就是半角公式：三角函数值的积可以化为三角函数值的和可以发现和差公式的右侧的两项是很容易消去的其中一项的，因此能够有积化和差公式：既然积能等于和差，那么和差也能等于积。换元：辅助角公式是用反正切函数和正弦和角公式，将任意角度的正弦和余弦值的任意线性组合化为一个单一的正弦函数：defined the and as a form of Tayler Series, with the heuristic of their geometric definition:]]></description><link>math/mathematical-analysis/elementary-functions/trigonometry-functions.html</link><guid isPermaLink="false">math/Mathematical Analysis/elementary functions/Trigonometry Functions.md</guid><pubDate>Mon, 16 Mar 2026 13:36:02 GMT</pubDate></item><item><title><![CDATA[Series]]></title><description><![CDATA[Series is an infinite sum, or the limit of Partial Sumsum of numberssum of functionsGeometric Harmonic
P-Series Alternating Power series: series of power functions: sum of powers of Tests:
Ratio test
Integral test
Comparison test. To prove that the image of the function is the limit of the Taylor series, write the series in the form of the sum of a partial sum and a Remainder: Taylor Series: if we know enough of the function's behavior at point , we can approximate it around by the power series: Fundamental Theorem of Calculus. Integrate by parts <a data-href="Integration skills#integration by parts" href="math/mathematical-analysis/integrals/integration-skills.html#integration by parts" class="internal-link" target="_self" rel="noopener nofollow">Integration skills &gt; integration by parts</a>. Integrate by parts: Suspect that: Inductive Reasoning. Maclaurin Series is simply a Taylor series centered at : Example: computing knowledge: So we know well about around , and thus can compute around .
from math import factorial  # standard library; replaces the broken relative import (.factorial)

DERIVATIVES = [0, 1, 0, -1]  # the derivatives of sin at 0 cycle through these values
pi = 3.141592653589793

def sin(x: float) -&gt; float:
    if x &lt; 0:
        return -sin(-x)  # sin is odd
    if x &gt; 2 * pi:
        return sin(x % (2 * pi))  # reduce by periodicity
    res = 0.0
    for n in range(100):  # truncated Maclaurin series
        res += DERIVATIVES[n % 4] * (x ** n) / factorial(n)
    return res
odd and even numbers:product is odd only when all factors are oddsum is odd only if there’s odd number of odd termsgiven two series, what operations are allowed?distribute calculating limitation w.r.t additionGiven:So:]]></description><link>math/mathematical-analysis/series.html</link><guid isPermaLink="false">math/Mathematical Analysis/Series.md</guid><pubDate>Mon, 16 Mar 2026 13:35:56 GMT</pubDate></item><item><title><![CDATA[Fundamental Theorem of Calculus]]></title><link>math/mathematical-analysis/fundamental-theorem-of-calculus.html</link><guid isPermaLink="false">math/Mathematical Analysis/Fundamental Theorem of Calculus.md</guid><pubDate>Mon, 16 Mar 2026 12:05:23 GMT</pubDate></item><item><title><![CDATA[Riemann Integral]]></title><description><![CDATA[Riemann SumRiemann Integral and , as long as , then Then is the limit of Riemann sumThis is due to the definition of Riemann integral and linear properties of limits. <a data-href="Limits#Rules with Real Number Operations" href="math/mathematical-analysis/limits.html#Rules with Real Number Operations" class="internal-link" target="_self" rel="noopener nofollow">Limits &gt; Rules with Real Number Operations</a>]]></description><link>math/mathematical-analysis/integrals/riemann-integral.html</link><guid isPermaLink="false">math/Mathematical Analysis/integrals/Riemann Integral.md</guid><pubDate>Sun, 15 Mar 2026 23:06:04 GMT</pubDate></item><item><title><![CDATA[Integration skills]]></title><description><![CDATA[sums of integral is integrals of sumsApply fundamental theory of calculus:same as applying fundamental theorem: always , no matter it’s written as , , or abbreviation of integrals Composition of functionsdifferentiate w.r.t. 
or : or simply:By applying Fundamental Theorem of Calculus:multiplyingOr simply:by using Fundamental Theorem of Calculus:for example:]]></description><link>math/mathematical-analysis/integrals/integration-skills.html</link><guid isPermaLink="false">math/Mathematical Analysis/integrals/Integration skills.md</guid><pubDate>Sun, 15 Mar 2026 08:09:15 GMT</pubDate></item><item><title><![CDATA[Euclid Domain]]></title><link>math/algebra/euclid-domain.html</link><guid isPermaLink="false">math/algebra/Euclid Domain.md</guid><pubDate>Sun, 08 Mar 2026 18:45:58 GMT</pubDate></item><item><title><![CDATA[Function]]></title><description><![CDATA[
domain: the complete set of all possible independent input values
codomain: the set of all potential output values
range/image: the set of actual output values produced.
aliases: onto function
definition:
image is equal to its codomainExamples:
The indexing consists of a surjective function from onto a set of functions between two fixed setsA vector space is defined as a vector set with a scalar field and the vector addition and scalar multiplication defined on them, satisfying 8 axioms listed below. Often called -vector space or vector space of .
vector set : the elements of are called vectors
scalar field : the elements of are called scalars
the binary operation: vector addition, the binary function: scalar multiplication, eight axioms:
a vector space is an abelian group under addition: associativity: commutativity: identity element of vector addition: inverse element of vector addition: additive inverse a ring homomorphism from the field into the endomorphism ring of this group Compatibility of scalar multiplication and field multiplication: Identity element of scalar multiplication: Distributivity of scalar multiplication w.r.t. vector addition: Distributivity of scalar multiplication w.r.t. field addition: ( direct consequences of the axioms:
: : uniqueness of additive inverse: if then then so so or assume that ]]></description><link>math/function.html</link><guid isPermaLink="false">math/Function.md</guid><pubDate>Sun, 08 Mar 2026 17:32:05 GMT</pubDate></item><item><title><![CDATA[Gaussian Distribution]]></title><description><![CDATA[alias: Error Distribution, Gaussian Distribution, Normal Distributionproposed by: German mathematician Johann Carl Friedrich Gaussto figuring out the distribution of error/observation bias
sum up to zero (average of samples is the true value) if errors are independent
<a data-href="Maximum Likelihood Estimation" href="masters/sem-b/cs5487-ml/maximum-likelihood-estimation.html" class="internal-link" target="_self" rel="noopener nofollow">Maximum Likelihood Estimation</a> of the error:With the sample average assumption, we know can be :Solve from the differential equation:Let's represent the with and first:<br>Here we use the <a data-href="Integration of Gaussian function" href="math/mathematical-analysis/integrals/integration-of-gaussian-function.html" class="internal-link" target="_self" rel="noopener nofollow">Integration of Gaussian function</a>Apply the normality of : , so , so]]></description><link>math/probability-theory/gaussian-distribution.html</link><guid isPermaLink="false">math/Probability Theory/Gaussian Distribution.md</guid><pubDate>Thu, 05 Mar 2026 20:41:06 GMT</pubDate></item><item><title><![CDATA[Probability]]></title><description><![CDATA[Note
数学定义的工具色彩更重，直观的定义往往是这种更宽松定义的推论
概率是集合函数，规定了一系列性质，方便在上面做代数运算定义： 可列可加性: 定义是最小的规则集合，可以推出更多我们能自然理解的规则：
: : : 本质上概率就是说我可以用这个函数建立起一套代数系统，方便我对事件做可能性的量化计算。至于怎么建模可能性，和这些个定义和运算方法目前无关，因此才产生了频率和贝叶斯的观点。 满足概率定义，因为： 所以适用于概率的恒等式也都适用于条件概率 乘法公式： 条件概率的条件概率：把新条件与到已有条件里：，因为
将看作，则有 全概率公式：有一组假设 是A的分割，即的和事件为A且互斥，则有： 贝叶斯定理 A: 肺癌，B：吸烟。现在发现肺癌里吸烟的人很多。
如果所有肺癌的人都吸烟、所有人都吸烟，那么吸烟一定导致肺癌？
如果所有肺癌的人都吸烟，但是吸烟的并不是所有人，那么吸烟得肺癌率被放大
如果肺癌的人吸烟比例不比人群中吸烟的人高，那么吸烟得肺癌率不比普通人的肺癌率高
患病 阳性比如
患病(A)率为 患病阳性(TPR)率为 不患病阴性率(TNR)为 那么根据全概率公式阳性率为但实际上阳性下患病率也就 ，因此TPR、TNR双高并不代表PPV就高，在样本比较均衡的时候，TPR、TNR高就意味着PPV、NPV双高但是当正负样本的分布极不均衡时：因此如果TPR/FPR不够大，稍微放大一下负样本的数量就会让PPV变小了。所以阳性下患病率而阴性下非患病率被放大了 ，患病率下降至 。FNR低于但是如果复查，检查的FNR、TPR不会变，但是人群变成了阳过一次的人群了，因此
P, N: the positive and negative sides of an example
T, F: correct judgment vs. wrong judgment. TP, TN: the number of correctly judged positive / negative examples
FP, FN: judged as P/N, but the judgment is wrong (F); the example is actually N/P. P = TP + FN, N = TN + FP
prediction accuracy within each true class: TPR (sensitivity), TNR (specificity): TP/P (the fraction of all positives judged correctly), TN/N
FPR (false positive rate), FNR (false negative rate): FP/N, FN/P. accuracy within each predicted class: PPV, NPV: TP/(TP+FP), TN/(TN+FN)
FDR (false discovery rate), FOR (false omission rate): FP/(TP+FP) (the complement of precision), FN/(TN+FN). over the whole set: accuracy. Note
truth {T,F} and prediction {P,N}, divided by {positive predictions, negative predictions, positive samples, negative samples}
Note: the fraction of the positive samples that are true positives is the true positive rate TPR, measuring the model's sensitivity; the fraction judged wrongly is the false negative rate FNR, the rate of Type II (β, accepting-the-false) errors
the fraction of the negative samples that are true negatives is the true negative rate TNR, reflecting the model's specificity; its complement, FPR, is the rate of Type I (α, rejecting-the-true) errors
the fraction of positive predictions that are true positives is the positive predictive value PPV; the fraction of negative predictions that are true negatives is the negative predictive value NPV
the fraction of positive predictions that are false positives is the false discovery rate; the fraction of negative predictions that are false negatives is the false omission rate. the concept originates from conditional probability but is not defined via conditional probability.
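The base-rate effect discussed above (a test with high TPR and TNR can still have a low PPV when prevalence is low) follows directly from the law of total probability. The prevalence and test rates below are illustrative:

```python
def ppv(prevalence, tpr, tnr):
    """P(disease | positive test) via the law of total probability."""
    p_positive = prevalence * tpr + (1 - prevalence) * (1 - tnr)  # overall positive rate
    return prevalence * tpr / p_positive

# a 99%-sensitive, 99%-specific test for a 0.1%-prevalence disease
low_base_rate_ppv = ppv(0.001, 0.99, 0.99)  # ~0.09: most positives are false alarms
balanced_ppv = ppv(0.5, 0.99, 0.99)         # ~0.99: with balanced classes, PPV tracks TPR/TNR
```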
正式定义：
推论： 独立了后，也独立，因为：
所以 也独立 所有结合条件概率定义的公式也同样在新增条件的条件概率上成立正式定义：
直观定义：，新增条件对我没影响
由于形式上的一致性（条件概率的条件概率定义，条件概率满足概率公式），所有普通独立性的推论也对条件独立成立。给问题找一个可能性均匀的样本空间，然后用比例来代表事件的概率。这种概率也满足基本概率定义，因此也能用上面讨论的概率的一系列结论。因为不是所有问题都可以简单地找一个可能性均匀的样本空间来建模，特别是样本空间有无穷样本点的时候。所以需要能够表示可能性不均匀的样本空间。比如说身高，古典概型只能用数人头的方法来统计，搞出来的结果就是离散的，依问题而定的，不精确的。但是用随机变量+分布模型就能加一些预设来描述”背后真正的分布“，而且由于放到数轴上了，就可以沿用很多以往的代数成果。又因为用事件作为概率函数输入太麻烦了不好用实函数的工具，因此搞了个映射把样本点映射到数轴，然后用一个随机变量作为输入，创建一个函数给这些数字赋予概率，用数轴上的“和”来计算被选中区间代表的事件的概率——数轴离散的时候是实数代数加法，数轴为实数轴的时候用积分。实际上还是用集合表示的事件，但集合的表示已经变成了用随机变量的取值范围描述，而概率值的计算变成了用”求和公式“（离散求代数和、连续求积分）计算。概率值的计算仍然满足概率定义，计算出的概率就仍然可以应用上面讨论的所有概率。]]></description><link>math/probability-theory/probability.html</link><guid isPermaLink="false">math/Probability Theory/Probability.md</guid><pubDate>Thu, 05 Mar 2026 20:40:08 GMT</pubDate></item><item><title><![CDATA[Matrix Calculus]]></title><description><![CDATA[1-order Partial Derivative of -th input2-order Partial Derivative of input then partial derivative of output on input 输出的变化量和输入的改变量是相同方向还是相反方向说明是源还是漏输入的改变量导致的旋转的方向、大小说明旋转的方向和强度$$
\nabla \times F
$$

$$
\frac{\partial a^{T}x}{\partial x} = a
$$

$$
\begin{align}
\left( \frac{\partial x^{T}Ax}{\partial x} \right)_{s} &amp;= \frac{ \partial \left( \sum_{i=1}^{N} \sum_{j=1}^{N} A_{i,j}x_{i}x_{j} \right)}{\partial x_{s}} \\
&amp;= \frac{\partial}{\partial x_{s}} \left( \sum_{j=1}^{N} A_{s,j}x_{s}x_{j} + \sum_{i=1}^{N} A_{i,s}x_{i}x_{s} - A_{s,s}x_{s}^{2} \right) \\
&amp;= \sum_{j=1}^{N} A_{s,j}x_{j} + \sum_{i=1}^{N} A_{i,s}x_{i} \\
&amp;= \left( \left( A + A^{T} \right) x \right)_{s}
\end{align}
$$

$$
\frac{\partial x^{T} A x}{\partial x} = \left( A+A^{T} \right) x
$$

$$
\frac{\partial \log |X|}{\partial X} = \left( X^{-1} \right)^{T}
$$]]></description><link>math/mathematical-analysis/derivatives/matrix-calculus.html</link><guid isPermaLink="false">math/Mathematical Analysis/derivatives/Matrix Calculus.md</guid><pubDate>Thu, 05 Mar 2026 20:28:52 GMT</pubDate></item><item><title><![CDATA[Multiple Integrals]]></title><description><![CDATA[For an integral defined on , : for a plane integral, the upper and lower sums can be decomposed into successive accumulation along each direction, and in each direction the mean value theorem for integrals squeezes the iterated integral between the upper and lower sums; the double-integral sum is also squeezed between them, so if the double integral exists it can necessarily be computed as an iterated integral. Also: if the Riemann sum exists, the limits of the upper and lower Darboux sums coincide, hence: similarly: Problem: The Cheese Wedge
Find the volume of the solid bounded by the following surfaces:
parabolic cylinder: (like a curved wall)
plane: (the bottom, the floor)
plane: (a slanted cutting face, like a knife slicing down)
如果积分上下限不互相影响，且，则： 构成的平行四边形面积为，又若有偏导数：The area of a plane defined by two vector is given by determinant in 2D or 3D space: 所以极坐标变换柱坐标变换球坐标变换]]></description><link>math/mathematical-analysis/integrals/multiple-integrals.html</link><guid isPermaLink="false">math/Mathematical Analysis/integrals/Multiple Integrals.md</guid><pubDate>Thu, 05 Mar 2026 19:42:28 GMT</pubDate></item><item><title><![CDATA[二项式]]></title><link>math/linear-algebra/二项式.html</link><guid isPermaLink="false">math/Linear Algebra/二项式.md</guid><pubDate>Thu, 05 Mar 2026 19:30:29 GMT</pubDate></item><item><title><![CDATA[Expectation]]></title><description><![CDATA[For discrete random variable For continuous random variableLOTUS (Law of the Unconscious Statistician) Theorem:It's all about sign cancellingthe probability density function is defined as:given , to transform to probability about , slice .if is monotonically decrease in if is monotonically increase in :take derivative of , apply chain rule:and both cases generate the same expression:So for piecewise monotonic : this leads to LOTUS for continuous variable:the potential negative sign of also cancel with the potential flipping of integral limit caused by substituting variables in definite integral：For discrete variable it's pretty intuitive:Since:So:Given by <a data-href="#LOTUS" href="math/probability-theory/expectation.html#LOTUS_0" class="internal-link" target="_self" rel="noopener nofollow">LOTUS</a> and linearity of <a data-href="Riemann Integral" href="math/mathematical-analysis/integrals/riemann-integral.html" class="internal-link" target="_self" rel="noopener nofollow">Riemann Integral</a>applying LOTUS:In conclusion:Expectation is also distributable w.r.t addition:Given , Expectation is distributable w.r.t multiplication:]]></description><link>math/probability-theory/expectation.html</link><guid isPermaLink="false">math/Probability Theory/Expectation.md</guid><pubDate>Thu, 05 Mar 2026 18:56:31 
GMT</pubDate></item><item><title><![CDATA[不等式]]></title><link>math/不等式.html</link><guid isPermaLink="false">math/不等式.md</guid><pubDate>Thu, 05 Mar 2026 18:27:27 GMT</pubDate></item><item><title><![CDATA[Integration of Gaussian function]]></title><description><![CDATA[Integration by substitution]]></description><link>math/mathematical-analysis/integrals/integration-of-gaussian-function.html</link><guid isPermaLink="false">math/Mathematical Analysis/integrals/Integration of Gaussian function.md</guid><pubDate>Thu, 05 Mar 2026 16:22:43 GMT</pubDate></item><item><title><![CDATA[Corelation]]></title><link>math/probability-theory/corelation.html</link><guid isPermaLink="false">math/Probability Theory/Corelation.md</guid><pubDate>Thu, 05 Mar 2026 08:12:21 GMT</pubDate></item><item><title><![CDATA[Variance]]></title><description><![CDATA[applying <a data-href="Expectation#Linearity" href="math/probability-theory/expectation.html#Linearity" class="internal-link" target="_self" rel="noopener nofollow">Expectation &gt; Linearity</a> and <a data-href="Expectation#Distributivity w.r.t addition" href="math/probability-theory/expectation.html#Distributivity w.r.t addition" class="internal-link" target="_self" rel="noopener nofollow">Expectation &gt; Distributivity w.r.t addition</a>:variance between two r.v.also:]]></description><link>math/probability-theory/variance.html</link><guid isPermaLink="false">math/Probability Theory/Variance.md</guid><pubDate>Thu, 05 Mar 2026 08:12:13 GMT</pubDate></item><item><title><![CDATA[Poisson Process]]></title><description><![CDATA[单次概率每次概率p，n次出现x次的概率当随机过程 被称为Poisson过程：
starts at zero:
memoryless:
independence: occurrences of events do not interfere with each other
sparsity: the probability of two events occurring (in the same instant) is almost 0
Poisson Process: the basic model for random events occurring in continuous time. rate: occurrences per unit time
counting view: the distribution of the number of occurrences per unit time (discrete): the Poisson distribution
interval view: the time gap between two adjacent events (continuous): the exponential distribution
waiting view: the total time from time zero until the n-th event (continuous): the gamma distribution
事件在任何时间发生的可能性相同，衡量一定时间发生的次数，使用到达率作为参数，将该段时间分为份，每份发生的概率为，使用二项分布，并让The expectation of Poisson distribution is:The Variance of Poisson distribution is:到时间首次发生的概率/等待时间发生的概率/时间内发生次数为0的概率，单位时间到达率为等待时间，即在时间段内，事件发生的次数为0CDF就为：PDF 就为:欧拉构造的能够让阶乘在实数上连续的函数令令，再令两次事件发生的时间间隔：指数分布
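The counting and interval views can be cross-checked by simulation: summing exponential inter-arrival gaps and counting events in a window should give counts whose mean and variance both approach rate × horizon. The parameters below are illustrative:

```python
import random

def simulate_poisson_counts(rate, horizon, runs, seed=0):
    """Count events in [0, horizon] by summing exponential inter-arrival gaps."""
    rng = random.Random(seed)
    counts = []
    for _ in range(runs):
        t, n = 0.0, 0
        while True:
            t += rng.expovariate(rate)  # exponential gap with mean 1/rate
            if t > horizon:
                break
            n += 1
        counts.append(n)
    return counts

counts = simulate_poisson_counts(rate=2.0, horizon=10.0, runs=2000)
mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / len(counts)
# for a Poisson process, both should be close to rate * horizon = 20
```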
Deduction: from general premises to a more specific conclusion
Induction: from special cases to a general form
等待n个事件发生所需要的时间高斯积分]]></description><link>math/probability-theory/poisson-process.html</link><guid isPermaLink="false">math/Probability Theory/Poisson Process.md</guid><pubDate>Wed, 04 Mar 2026 18:22:36 GMT</pubDate></item><item><title><![CDATA[Polynomials]]></title><link>math/polynomials.html</link><guid isPermaLink="false">math/Polynomials.md</guid><pubDate>Wed, 04 Mar 2026 14:18:49 GMT</pubDate></item><item><title><![CDATA[假设检验]]></title><description><![CDATA[提出两个假设，一个零假设，在该假设下计算概率，一个与零假设对立的备择假设，在零假设被拒绝的时候选择。设立一个显著性水平，依赖抽取的样本，当计算出的概率P统计量 参数参数是描述抽象的“总体”的，而统计量则是描述能获得的“样本”的。可以设计不同的统计量，并推导得到它们的分布即抽样分布抽样分布
the chi-square distribution is the distribution of the chi-square statistic
the t distribution is the distribution of the t statistic. confidence interval]]></description><link>math/probability-theory/假设检验.html</link><guid isPermaLink="false">math/Probability Theory/假设检验.md</guid><pubDate>Mon, 02 Mar 2026 19:21:06 GMT</pubDate></item><item><title><![CDATA[Stochastic Process]]></title><description><![CDATA[a family of random variables
the index of the family often has the interpretation of time
Brownian motion: Bachelier, The Theory of Speculation
Poisson process: Erlang, the number of phone calls occurring in a period of time
]]></description><link>math/probability-theory/stochastic-process.html</link><guid isPermaLink="false">math/Probability Theory/Stochastic Process.md</guid><pubDate>Sun, 01 Mar 2026 07:46:31 GMT</pubDate></item><item><title><![CDATA[Distribution]]></title><description><![CDATA[random variable: map random event to real number.given any random event, measure the possibility.we have many methods to represent the distribution:
Cumulative Density Function: Possibility Density Function: Possibility Mass Function: ]]></description><link>math/probability-theory/distribution.html</link><guid isPermaLink="false">math/Probability Theory/Distribution.md</guid><pubDate>Sat, 28 Feb 2026 20:03:18 GMT</pubDate></item><item><title><![CDATA[Decomposition]]></title><description><![CDATA[lower-upper decomposition下三角矩阵相乘还是下三角矩阵：如果，那么，此时恒有，因此，因此，是下三角矩阵对应的逆矩阵是：因此构成LU分解的L矩阵格拉姆-施密特正交化算法
pick a vector to start with, normalize it, and take it as the first basis vector
pick a not-yet-chosen vector and subtract its components along the previously chosen basis vectors, so that the result is orthogonal to them; then normalize it
repeat this process until every vector has been processed
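The steps above can be sketched in plain Python (a toy version without numpy; inputs are lists of floats, and linearly dependent inputs are skipped):

```python
def gram_schmidt(vectors):
    """Orthonormalize `vectors` by subtracting projections onto earlier basis vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            proj = sum(wi * qi for wi, qi in zip(w, q))  # component of w along q
            w = [wi - proj * qi for wi, qi in zip(w, q)]
        norm = sum(wi * wi for wi in w) ** 0.5
        if norm > 1e-12:  # skip linearly dependent inputs
            basis.append([wi / norm for wi in w])
    return basis

q = gram_schmidt([[1.0, 1.0], [1.0, 0.0]])
# q[0] and q[1] are orthonormal: unit length, zero dot product
```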
能够正交是因为：
点积有线性法则：寻找 能使得 和 垂直，解得 因此能够构造每个步骤都是基本列操作，表示成矩阵方程：因此：R为上三角矩阵]]></description><link>math/linear-algebra/decomposition.html</link><guid isPermaLink="false">math/Linear Algebra/Decomposition.md</guid><pubDate>Sat, 28 Feb 2026 18:42:39 GMT</pubDate></item><item><title><![CDATA[Tensor]]></title><description><![CDATA[从最后的维度开始，如果维度不一致，且一个为1，一个为份，则可以将低维数据复制份对齐维度后操作。]]></description><link>math/linear-algebra/tensor.html</link><guid isPermaLink="false">math/Linear Algebra/Tensor.md</guid><pubDate>Sun, 22 Feb 2026 13:22:41 GMT</pubDate></item><item><title><![CDATA[Tangent]]></title><description><![CDATA[<a data-href="曲面曲线" href="math/mathematical-analysis/applications/曲面曲线.html" class="internal-link" target="_self" rel="noopener nofollow">曲面曲线</a>切线的几何定义：当割线的两点无限接近的时候，为切线。按照这个定义，割线方程为：而切线方程自然为：右式和方向的方向导数一致。对于一个函数，因为它有个自由度，因此它在维空间定义了一个空间曲面：假设有一由引导的、过点点路线，定义为：，对求导有：也即：当这个路线代表的是函数在方向上的切线的时候，。将法方向定义为与曲面任何切线垂直的方向，那么对于任意应当有：，因此在任意方向上（未验证）（未验证）对于任意等值路径因此垂直于等值面]]></description><link>math/mathematical-analysis/derivatives/tangent.html</link><guid isPermaLink="false">math/Mathematical Analysis/derivatives/Tangent.md</guid><pubDate>Sun, 22 Feb 2026 13:07:00 GMT</pubDate></item><item><title><![CDATA[曲面曲线]]></title><description><![CDATA[维的曲面或曲线本质上是点集，分别有个、个自由度。在三维空间里，可以定义为由一个、两个自由度定义的约束方程。一个自由度的是曲线：当曲面表示为显函数的时候，就明确了这个曲面由两个自由度控制。]]></description><link>math/mathematical-analysis/applications/曲面曲线.html</link><guid isPermaLink="false">math/Mathematical Analysis/applications/曲面曲线.md</guid><pubDate>Sun, 22 Feb 2026 12:46:24 GMT</pubDate></item><item><title><![CDATA[概率收敛]]></title><description><![CDATA[
依分布收敛：，
依概率收敛：，
几乎处处收敛：，
写作：定义为累积分布函数逐点收敛：说的是随机变量序列越靠后分布越来越稳定写作：落在邻域的概率为1说的是随机变量序列越靠后落于半径外的概率越来越趋近于0almost surely收敛于分布的事件概率为一说的是随机变量序列收敛于某个确定实数的概率为1。]]></description><link>math/probability-theory/概率收敛.html</link><guid isPermaLink="false">math/Probability Theory/概率收敛.md</guid><pubDate>Sun, 22 Feb 2026 01:37:12 GMT</pubDate></item><item><title><![CDATA[Central Limit Theorem]]></title><description><![CDATA[
independence of errors
small errors are more probable: the error probability density function
arithmetic-mean principle: the maximum likelihood estimate should be the arithmetic mean of the observations
solve the differential equation: such that: integrate: with enough samples, the distribution of the sample mean approaches a normal distribution]]></description><link>math/probability-theory/central-limit-theorem.html</link><guid isPermaLink="false">math/Probability Theory/Central Limit Theorem.md</guid><pubDate>Fri, 20 Feb 2026 12:49:44 GMT</pubDate></item><item><title><![CDATA[Measure]]></title><description><![CDATA[
non-negativity: the measure of any set cannot be negative:
the measure of the empty set is zero:
countable additivity:
different measures:
Lebesgue measure: length of an interval, area of a region
Counting Measure
Probability Measure
Dirac Measure
Determinant]]></description><link>math/linear-algebra/measure.html</link><guid isPermaLink="false">math/Linear Algebra/Measure.md</guid><pubDate>Tue, 17 Feb 2026 08:29:37 GMT</pubDate></item><item><title><![CDATA[n维空间]]></title><description><![CDATA[在三维空间里有，定义在维空间里:将多个向量叉乘，最终获得一个垂直于所有输入向量的向量，如何做？将叉乘定义为那么由行列式的展开，]]></description><link>math/linear-algebra/n维空间.html</link><guid isPermaLink="false">math/Linear Algebra/n维空间.md</guid><pubDate>Tue, 17 Feb 2026 07:15:21 GMT</pubDate></item><item><title><![CDATA[线性代数]]></title><description><![CDATA[向量，向量空间
Matrices, determinants
Linear maps
Systems of linear equations. Set + operations = group, ring, field. Two equivalent readings: the column view and the row view
Vector spaces:
Matrix equation: the columns of A are vectors and x holds their weights. Matrix multiplication: with this definition, and provided addition is associative (brackets may be placed anywhere) and commutative (terms may be reordered), the columns of A are vectors, the columns of B are different weight combinations, and the columns of C are the corresponding weighted sums.
Matrix multiplication decomposes into many vector dot products. Linear dependence and independence of a set of vectors: whether some vector can be expressed as a linear combination of the others. A single equation admits several readings:
Linear combination: take a set of vectors, weight them, and sum; the resulting vector is a linear combination of the set
Linear transformation: treat a vector as weights and take the weighted sum of a set of vectors; the set then represents a transformation that maps the vector to a unique image
System of linear equations: a set of first-degree equations in several unknowns, used jointly as constraints to describe an unknown vector.
In a system of linear equations the unknowns of every equation are aligned, and adding, subtracting, or scaling equations only rescales their coefficients, i.e. the rows of the matrix; hence elementary row operations on the augmented matrix do not change the solution set. This gives Gaussian elimination, with forward elimination and back substitution. Gaussian elimination (on the augmented matrix):
Forward elimination: starting from the first of the not-yet-selected rows, take its first nonzero element as the pivot, eliminate every element below the pivot with elementary row operations, and scale the pivot to 1. When forward elimination finishes, the matrix is in row echelon form.
Back substitution: starting from the last pivot, eliminate every element above each pivot with elementary row operations.
We now have two sequences of row operations, one from forward elimination and one from back substitution; applying them to the system, each pivot unknown is determined by one equation, and the remaining free unknowns are described by linear combinations alongside the pivots. The number of solutions is therefore:
If the echelon form of the coefficient matrix has more zero rows than that of the augmented matrix, there is an impossible equation (zero equals a nonzero constant) that no x can satisfy: the system has no solution. Otherwise it is always solvable
If the echelon form has no zero row, no contradiction is possible, so a solution always exists;
If the system is solvable but the number of nonzero rows of the echelon form is less than the number of unknowns (columns), the solution contains free unknowns and there are infinitely many solutions; otherwise the solution is unique
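The forward-elimination and back-substitution procedure described above can be sketched as follows (an illustrative implementation assuming a unique solution exists, not code from the note):

```python
# Sketch of Gaussian elimination on an augmented matrix [A | b].
# (Illustrative code, not from the note; assumes a unique solution exists.)
from fractions import Fraction

def gauss_solve(aug):
    """Forward elimination then back substitution on augmented rows."""
    rows = [[Fraction(x) for x in r] for r in aug]
    n = len(rows)
    # Forward elimination: pick a pivot, clear entries below it, scale it to 1.
    for i in range(n):
        pivot_row = next(r for r in range(i, n) if rows[r][i] != 0)
        rows[i], rows[pivot_row] = rows[pivot_row], rows[i]
        rows[i] = [x / rows[i][i] for x in rows[i]]
        for r in range(i + 1, n):
            factor = rows[r][i]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[i])]
    # Back substitution: clear the entries above each pivot.
    for i in reversed(range(n)):
        for r in range(i):
            factor = rows[r][i]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[i])]
    return [r[-1] for r in rows]

# 2x + y = 5, x + 3y = 10  ->  x = 1, y = 3
print(gauss_solve([[2, 1, 5], [1, 3, 10]]))  # [Fraction(1, 1), Fraction(3, 1)]
```

Using exact `Fraction` arithmetic keeps the row operations free of floating-point error.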
It is easy to see that a pivot row cannot be expressed as a linear combination of the other rows (the other rows are 0 in that pivot column), so the nonzero rows of the echelon form are linearly independent. The cases above can be summarized with the concept of rank. Define the row rank as the number of nonzero rows of the echelon form, i.e. the number of pivots. Transposing a row echelon form yields a lower triangular matrix whose trailing zero rows become trailing columns; this matrix is what a sequence of column operations corresponding to the original row operations produces. Applying elementary row operations to clear the entries under every "pivot" and moving the newly created zero rows down again yields an echelon form, and the number of pivots is unchanged. Since the column operations are invertible and the original pivots shift over by exactly one column, the row rank equals the column rank. The transpose of the echelon form is again an echelon form. Rank of a matrix:
Setting the augmented matrix aside: applying a row operation to a matrix is a linear-combination operation, and if a row can be completely eliminated by a combination of the rows above it, then that row can be expressed by that combination; the result of its linear transformation is likewise expressible by a fixed linear combination of the others. In a system of linear equations, row rank: the number of pivots of the row echelon form. By Gaussian elimination, a square matrix A having full rank is equivalent to the system having a unique solution (the two systems have the same solution set, one matrix being the row echelon form of the other). An elementary row operation equals left multiplication by an elementary row matrix, and an elementary column operation equals right multiplication by an elementary column matrix. Is invertibility of the product then equivalent to invertibility of the factors? If each factor is invertible, then so is the product;
if a factor is not invertible, the product is not invertible, i.e. there exists no matrix completing it to the identity; the canonical form being the identity implies invertibility
Invertibility implies the system has a unique solution, which implies elementary row operations can reduce A to the identity; equivalently, the canonical form of A is the identity
Therefore "the canonical form of A is the identity" is equivalent to A being invertible. Full row rank ⇔ unique solution: 1⇒2 by Gaussian elimination (elementary row operations, i.e. left multiplication by elementary row matrices)
2⇒1 by contradiction: if A is not of full rank there are free unknowns, hence infinitely many solutions. 3⇒1: back substitution (row operations) reduces A to the canonical form; what the elementary row operations produce also satisfies the definition of row echelon form, hence full row rank
1⇒3: full row rank means row operations reduce A to the identity. Invertibility: 3⇒4, the inverse of A is the inverse of the product of all the elementary row matrices
4⇒2: invertibility gives a unique solution. 4⇔5: by the definitions of matrix multiplication, matrix transpose, and invertibility. Row vectors linearly independent, 3⇒6: the row canonical form of A being the identity shows that the rows of A cannot be combined linearly into the zero vector
6⇒3 by contradiction: if the row canonical form of A contains a zero row, then the rows of A can be combined linearly into the zero vector, with the nonzero coefficients recorded in the elementary row matrices. Columns linearly independent: a restatement of 2. Full column rank: by the results above, equivalent to invertibility, to linearly independent columns, to linearly independent rows, and to full row rank; thus full row rank is equivalent to full column rank
The row vectors of A are linearly independent iff the column vectors of A are linearly independent
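The row-rank equals column-rank equivalence can be spot-checked numerically (illustrative snippet, not from the note; `np.linalg.matrix_rank` is assumed available):

```python
# Numerical check (illustrative, not from the note): row rank equals column
# rank, i.e. rank(A) == rank(A^T), including for rank-deficient matrices.
import numpy as np

A = np.array([[1, 2, 3],
              [2, 4, 6],   # twice row 0, so the rank drops to 2
              [0, 1, 1]])

print(np.linalg.matrix_rank(A), np.linalg.matrix_rank(A.T))  # 2 2
```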
n vectors are linearly independent when no vector among them can be expressed by the others, i.e. the homogeneous system in the coefficients has only the trivial solution.
This is equivalent to the system having the zero vector as its unique solution, which is equivalent to the square matrix having full rank. A vector space is a set whose elements, called vectors, can be added together and scaled by scalars; addition and scalar multiplication satisfy the vector-space axioms ("a vector space over a given field")
The space spanned by the column vectors of A is called the column space of A.
The row space of A; the null space of A:
Basis: linearly independent and spanning. Identity element: of vector addition; of scalar multiplication. Distributivity. Define:
consequences:
0v = 0: since 0v = (0 + 0)v = 0v + 0v, adding the inverse of 0v to both sides gives 0 = 0v
a0 = 0: since a0 = a(0 + 0) = a0 + a0, adding the inverse of a0 to both sides gives 0 = a0
(−1)v = −v: since v + (−1)v = (1 + (−1))v = 0v = 0, the additive inverse of v is (−1)v. F-vector space V. Linear subspace of a vector space: a nonempty subset closed under vector addition and scalar multiplication. An intersection of subspaces is again a subspace: since each is a subspace, the intersection is also closed under the vector-space operations and is therefore a subspace. For a subset of the space, its span is the smallest linear space containing it; the span consists of all linear combinations of its elements. Basis: a subset that spans the space and whose elements are linearly independent. Coordinate. Eigenvector and eigenvalue; eigenspace. If the matrix formed by the eigenvectors is invertible, then the matrix can be diagonalized. Trace:
Null space / kernel: the solution set of Ax = 0, closed under vector addition and scalar multiplication
Eigenspace: for a particular eigenvalue λ, the solution set of (A − λI)x = 0, which also forms a vector space
Function f: X → Y. Y, the codomain: a set into which all of the outputs are constrained to fall. Image: the set of outputs actually attained. X, the domain: the set of inputs accepted by the function
G, the graph: the set of ordered pairs (x, f(x))
classification:
surjection / surjective function (满射): the image is the codomain
injection / injective function / one-to-one function (单射): distinct inputs have distinct outputs
bijection / bijective function / one-to-one correspondence (双射/一一对应): bijective = surjective &amp; injective. If A is a triangular matrix
Two points determine a line. Geometric meaning of the dot product. Definition of the dot-product operation. By the law of cosines: (third side)^2 = (first side)^2 + (second side)^2 − 2 (first side)(second side) cos(angle between the first and second sides)
Representing the third side as a vector and expanding the square of its norm as the side length gives: (third side)^2 = (first side)^2 + (second side)^2 − 2 (dot product of the first and second sides), and this holds in any dimension n &gt; 2
Hence the dot product of two vectors equals the product of their norms times the cosine of the included angle. Outer product: searching for "outer product" mostly turns up the exterior algebra
Exterior product, exterior algebra, matrix addition
Associative algebra, tensor product, bilinear operation
Covariant, vector space, exterior algebra. The exterior algebra is a mathematical object consisting of a vector space together with a multiplication called the wedge or exterior product; it is the most general unital associative algebra built from the exterior product and the vector space. The exterior algebra of Euclidean three-space. Matrix addition:
direct sum (直和): concatenate into a block-diagonal matrix
Kronecker sum; Kronecker Product: scale a copy of B by each element of A, then assemble the copies into a block matrix
Kronecker Sum: A ⊕ B = A ⊗ I + I ⊗ B, where the identity paired with A has the same shape as the square matrix B. The k-th exterior power has dimension n choose k, with basis the exterior products of k basis vectors taken in increasing order. Keywords: cofactor, adjugate matrix, inverse matrix. Laplace expansion says the determinant can be computed from cofactors (to understand: why?):
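The Kronecker constructions above can be sketched with NumPy (illustrative, not from the note; the Kronecker-sum formula A ⊗ I + I ⊗ B is the standard definition assumed here):

```python
# Sketch (not from the note): Kronecker product via np.kron, and the
# Kronecker sum A (+) B = A (x) I_m + I_n (x) B for square A (n×n), B (m×m).
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

kron_prod = np.kron(A, B)  # each entry of A scales a copy of B: 2×2 blocks
kron_sum = np.kron(A, np.eye(2)) + np.kron(np.eye(2), B)

print(kron_prod.shape, kron_sum.shape)  # (4, 4) (4, 4)
```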
Swapping two rows flips the sign of the determinant
Scaling a row by a factor scales the determinant by the same factor
Adding a multiple of one row to another leaves the determinant unchanged (a scaling by factor 1)
Hence after finitely many elementary row operations the determinants remain proportional by a known factor; if any one form has determinant 0, every form has determinant 0. This also explains the geometric meaning of the determinant: the product of the factors by which the linear transformation scales each direction, i.e. the size of the "volume" spanned by the matrix's column vectors. On the other hand, if the determinant is nonzero then the system necessarily has a unique solution, meaning the echelon form has no zero row, meaning full rank; therefore the matrix is invertible
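The three row-operation rules can be verified numerically (illustrative sketch, not from the note):

```python
# Numerical check of the three row-operation rules for determinants.
# (Illustrative example, not part of the original note.)
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
d = np.linalg.det(A)  # 2*3 - 1*1 = 5

swap = A[[1, 0], :]                     # swap the two rows
scaled = A.copy(); scaled[0] *= 4       # scale row 0 by 4
added = A.copy(); added[1] += 7 * A[0]  # add 7 × row 0 to row 1

assert np.isclose(np.linalg.det(swap), -d)       # sign flips
assert np.isclose(np.linalg.det(scaled), 4 * d)  # scales by the same factor
assert np.isclose(np.linalg.det(added), d)       # unchanged
print("row-operation rules verified")
```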
This gives the necessary condition. It is easy to see that the left-hand side consists of the original rows of A verbatim, so one can construct the adjugate matrix, which gives A adj(A) = det(A) I. If det(A) is nonzero, then evidently A (adj(A)/det(A)) = I, which shows the condition is also sufficient for invertibility.]]></description><link>math/linear-algebra/线性代数.html</link><guid isPermaLink="false">math/Linear Algebra/线性代数.md</guid><pubDate>Mon, 09 Feb 2026 10:44:37 GMT</pubDate></item><item><title><![CDATA[Formal system]]></title><link>math/formal-system.html</link><guid isPermaLink="false">math/Formal system.md</guid><pubDate>Wed, 04 Feb 2026 09:33:09 GMT</pubDate></item><item><title><![CDATA[圆的大小]]></title><link>math/圆的大小.html</link><guid isPermaLink="false">math/圆的大小.md</guid><pubDate>Wed, 28 Jan 2026 23:19:10 GMT</pubDate></item><item><title><![CDATA[圆周率]]></title><description><![CDATA[An isosceles triangle with given apex angle and height
Area:
Base length:
An isosceles triangle with given apex angle and leg length
Area:
Base length: A regular n-gon consists of n congruent isosceles triangles, each with apex angle 360/n.
Perimeter of the regular n-gon inscribed in a circle:]]></description><link>math/圆周率.html</link><guid isPermaLink="false">math/圆周率.md</guid><pubDate>Thu, 22 Jan 2026 08:15:42 GMT</pubDate></item><item><title><![CDATA[抽象代数]]></title><description><![CDATA[Groups, rings, fields. Group theory: a set plus one operation, with the requirements:
Closure
Associativity:
Identity element:
Inverse element:
Corollaries:
The symmetry of the identity implies that the identity is unique:
Associativity together with the symmetry of inverses implies that inverses are unique:
Meaning:
Identity terms can be cancelled: operating on a with e still yields a
Every term is invertible: operating on a with its inverse yields e
Adding commutativity yields a commutative group (abelian group). Exponentiation:
Definition: from the associativity of the semigroup,
From group associativity and the definition of inverses, the negative power is defined as the inverse of the corresponding positive power: this is the meaning of negative exponents
Combining with the definition of inverses, the zeroth power is defined to be the identity element
This is easily derived, and from here the rules for integer powers follow. Cyclic groups, generators:
A finite cyclic group is isomorphic to the additive group of integers modulo n
An infinite cyclic group is isomorphic to the additive group of the integers, with generator 1
Real addition is a group: it is closed and associative, has the additive identity 0, and every real has an additive inverse
Multiplication including zero is not a group: 0 has no inverse that could produce the identity 1
Multiplication of nonzero reals: closed, associative, with identity and inverses, hence a group
Addition of integers modulo n is a commutative group:
Set:
Operation: integer addition, reduced mod n
Closure follows directly from the mod operation
Associativity, identity, and commutativity follow from integer addition
The inverse requirement holds: for any a, the inverse is n − a (mod n)
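The group axioms for addition mod n can be brute-force checked for a small n (illustrative sketch, not from the note; Z_5 is an arbitrary choice):

```python
# Brute-force check (illustrative, not from the note) that Z_5 under
# addition mod 5 is a commutative (abelian) group.
n = 5
G = range(n)
op = lambda a, b: (a + b) % n

assert all(op(a, b) in G for a in G for b in G)                                  # closure
assert all(op(op(a, b), c) == op(a, op(b, c)) for a in G for b in G for c in G)  # associativity
assert all(op(a, 0) == a and op(0, a) == a for a in G)                           # identity 0
assert all(op(a, (n - a) % n) == 0 for a in G)                                   # inverses
assert all(op(a, b) == op(b, a) for a in G for b in G)                           # commutativity
print("Z_5 is an abelian group under addition mod 5")
```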
Ring theory. Expression, term, factor. A set plus two operations, with the requirements:
The first operation must form an abelian group, called "addition"; the identity of addition is called the zero element
The second operation must form a semigroup (not every element need be invertible), called "multiplication"
Multiplication must distribute over addition from both the left and the right
Properties
Distributivity implies: Field theory requires that the nonzero elements form a commutative group under multiplication.
What about multiplication involving the zero element?]]></description><link>math/algebra/抽象代数.html</link><guid isPermaLink="false">math/algebra/抽象代数.md</guid><pubDate>Sat, 03 Jan 2026 04:50:13 GMT</pubDate></item><item><title><![CDATA[逻辑代数]]></title><description><![CDATA[
As in any algebra: use variables to stand for values, define operations, discover and apply identities to transform expressions (for simplification or other ends), and finally substitute values to evaluate.
Values are Boolean only (true/false, T/F, 0/1)
An n-ary logical operation is defined by an n-ary truth table.
Parenthesize left to right. Priority. Identities: T, F. Associativity: Distributivity: Useful shortcut formulas: The implication operation
Definition: Intuitive definition: Corollaries (they complicate truth-value computation but aid understanding): Applying it twice: Logical variables. The equivalence operation is defined in terms of implication
Definition:
Corollaries:]]></description><link>math/algebra/逻辑代数.html</link><guid isPermaLink="false">math/algebra/逻辑代数.md</guid><pubDate>Sat, 27 Dec 2025 19:29:34 GMT</pubDate></item><item><title><![CDATA[关系代数]]></title><description><![CDATA[A relation is defined as a subset of a Cartesian product of sets.
Selection, Projection, Join. Data table: rows and columns (Record, Field); a two-dimensional table with no duplicate rows
Primary Key, index
An index into the data
An index alone can locate a unique record; in other words, records with different values must not share the same index. (Avoiding the curse of dimensionality: choose index fields so that one index value may match many Records, but those records are all very similar.) Relation matrix: Relation graph: ]]></description><link>math/algebra/关系代数.html</link><guid isPermaLink="false">math/algebra/关系代数.md</guid><pubDate>Sat, 20 Dec 2025 12:30:58 GMT</pubDate></item><item><title><![CDATA[集合代数]]></title><description><![CDATA[
The set operations (intersection, union, complement, difference) are defined in similar set-builder language; in effect they are defined through logical operations.
Hence, with one exception, set identities have the same form as the logical identities and can be carried over wholesale, mapping conjunction to intersection and disjunction to union.
For the exception, the approach is to substitute an equivalent identity for it.
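The logic-to-set correspondence can be demonstrated on a small finite universe (illustrative sketch, not from the note; the universe U and the sets A, B are arbitrary choices):

```python
# Sketch (not from the note): set identities mirror logical identities,
# e.g. De Morgan's laws, with "and" as intersection, "or" as union, and
# "not" as complement within a finite universe.
U = set(range(10))  # the universe plays the role of "true"
A = {1, 2, 3}
B = {3, 4, 5}

complement = lambda s: U - s  # set complement plays the role of "not"

assert complement(A | B) == complement(A) & complement(B)  # not(a or b) = not a and not b
assert complement(A & B) == complement(A) | complement(B)  # not(a and b) = not a or not b
print("De Morgan's laws hold for these sets")
```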
The subset and equality relations are defined by formulas of that same shape, so the corresponding rules follow easily from the logical rules for implication and equivalence: subset corresponds to an implication, and equality to an equivalence. Cartesian Product: ]]></description><link>math/algebra/集合代数.html</link><guid isPermaLink="false">math/algebra/集合代数.md</guid><pubDate>Thu, 18 Dec 2025 23:16:09 GMT</pubDate></item><item><title><![CDATA[Long Division]]></title><description><![CDATA[Long division factors the dividend and obtains the quotient term by term: first find the term that cancels the leading part of the dividend, then take what remains:
and keep doing the same with what is left.]]></description><link>math/algebra/long-division.html</link><guid isPermaLink="false">math/algebra/Long Division.md</guid><pubDate>Fri, 05 Dec 2025 09:48:10 GMT</pubDate></item></channel></rss>