
        Commissioned coursework: INFSCI 0510, Java/Python programming

        Date: 2024-05-26  Source: Hefei Net (hfw.cc)  Author: hfw.cc



        Coursework: Kernel PCA for Linearly-Inseparable Dataset
        INFSCI 0510 Data Analysis, Department of Computer Science, SCUPI Spring 2024
        This coursework contains coding exercises and text justifications. Please read the instructions carefully and follow them step by step. For submission instructions, please read the last section. If you have any questions about the coursework sheet, please contact the TAs or the course leader. Due: 23:59, Wednesday, June 5th.
        PCA
        In our lectures, we introduced principal component analysis (PCA). Given a dataset $X \in \mathbb{R}^{d \times n}$ with $n$ data points of $d$ dimensions, we want to project $X$ onto a low-dimensional subspace whose basis vectors $U \in \mathbb{R}^{d \times k}$ are the principal components (PCs), computed as follows:
        $\tilde{X} = U \Sigma V^{T}$, (1)
        where $\tilde{X}$ is the standardised, zero-mean version of $X$. Eq. (1) is called the singular value decomposition (SVD).
        Based on the PC matrix $U$, the projection onto low-dimensional features $Z \in \mathbb{R}^{k \times n}$, with $k < d$, is:
        $Z = U^{T} X$. (2)
        Compared with $X$, these low-dimensional features $Z$ retain substantial information in fewer dimensions and are therefore preferred for the learning task.
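As an illustration of Eqs. (1) and (2), the following is a minimal sketch that uses only NumPy's SVD. The function name `pca_svd`, the random test data, and the choice to centre without rescaling to unit variance are assumptions made for this example; they are not taken from the coursework sheet.

```python
import numpy as np

def pca_svd(X, k):
    """Project X (d x n, one column per sample) onto its first k principal components."""
    # Centre each feature (row) to zero mean. Rescaling to unit variance
    # could be added here if full standardisation is required.
    X_centred = X - X.mean(axis=1, keepdims=True)
    # Thin SVD, Eq. (1): columns of U are ordered by decreasing singular value.
    U, S, Vt = np.linalg.svd(X_centred, full_matrices=False)
    U_k = U[:, :k]               # d x k basis of principal components
    Z = U_k.T @ X_centred        # k x n low-dimensional features, Eq. (2)
    return Z, U_k, S

# Illustrative random data (not the coursework dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 50))
Z, U_k, S = pca_svd(X, k=2)
print(Z.shape)
```

Because the projection is applied to the centred data, the rows of `Z` are uncorrelated and `Z @ Z.T` is diagonal with the squared singular values on its diagonal.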
        Kernel Trick
        Besides the PCA process for dimensionality reduction, we also introduced dimensionality expansion in our lectures through a change of basis. For a linearly-inseparable dataset $X \in \mathbb{R}^{d \times n}$, it is possible to find a hyperplane that solves the classification task with zero error by transforming $X$ into a high-dimensional superspace. In this case, the classification task is conducted on the transformed data, represented as $\phi(X) \in \mathbb{R}^{D \times n}$ with $D > d$, where $\phi(\cdot)$ denotes the transformation function. By projecting the hyperplane back to the original space, we can produce a non-linear solution for the classification task.
        However, recall from the lectures that such a change of basis may be computationally expensive. To solve this issue, we introduced the kernel trick. Specifically, to perform the classification task on the projected dataset $\phi(X)$, we can use a kernel function $K(\cdot,\cdot)$ that computes the dot product $\langle \phi(x_i), \phi(x_j) \rangle$ of any two projected samples $x_i$ and $x_j$:
        $K(x_i, x_j) = \langle \phi(x_i), \phi(x_j) \rangle$, (3)
        where the kernel function $K(\cdot,\cdot)$ takes the original samples $x_i$ and $x_j$ as inputs. Hence, the dot product is calculated without explicitly computing the computationally expensive transformation $\phi(X)$. Many kernel functions exist; in this coursework, we will focus on two types of kernels:
        1. Homogeneous Polynomial kernel: $K(x_i, x_j) = \langle x_i, x_j \rangle^{p}$, where $p > 0$ is the polynomial degree.
        2. Radial Basis Function (RBF) kernel, also called Gaussian kernel: $K(x_i, x_j) = e^{-\gamma \lVert x_i - x_j \rVert^{2}}$, where $\gamma = \frac{1}{2\sigma^{2}}$ and $\sigma$ is the width or scale of a Gaussian distribution centered at $x_j$.
        Kernel PCA
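Kernel PCA works with the kernel matrix $K$, whose entry $(i, j)$ is $K(x_i, x_j)$ for one of the two kernels listed above. The sketch below shows one way to build both matrices with NumPy, and checks Eq. (3) numerically for the polynomial kernel with $p = 2$ and $d = 2$. The function names, the explicit feature map `phi_degree2`, and the random data are illustrative assumptions.

```python
import numpy as np

def polynomial_kernel(X, p):
    """Homogeneous polynomial kernel matrix for X of shape (d, n)."""
    return (X.T @ X) ** p

def rbf_kernel(X, sigma):
    """Gaussian (RBF) kernel matrix for X of shape (d, n), gamma = 1 / (2 sigma^2)."""
    gamma = 1.0 / (2.0 * sigma ** 2)
    sq_norms = np.sum(X ** 2, axis=0)
    # ||xi - xj||^2 = ||xi||^2 + ||xj||^2 - 2 <xi, xj>
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X.T @ X)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-gamma * sq_dists)

def phi_degree2(X):
    """Explicit feature map for p = 2, d = 2: (x1^2, sqrt(2) x1 x2, x2^2)."""
    x1, x2 = X
    return np.vstack([x1 ** 2, np.sqrt(2.0) * x1 * x2, x2 ** 2])

# Illustrative random data (not the coursework dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(2, 5))
K_poly = polynomial_kernel(X, p=2)
K_explicit = phi_degree2(X).T @ phi_degree2(X)   # dot products in the superspace
K_rbf = rbf_kernel(X, sigma=1.0)
print(np.allclose(K_poly, K_explicit))
```

The two ways of computing the degree-2 matrix agree, but the kernel version never forms the three-dimensional features. For the Gaussian kernel no finite-dimensional explicit map exists, so the kernel matrix is the only practical route.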
        Kernel PCA combines PCA with the kernel trick: we are still interested in using the PCA process to find the features $Z \in \mathbb{R}^{k \times n}$. However, the dimensionality of these features now ranges from 1 up to a large number $D$, i.e., $k \in [1, D)$, because we first transform $X$ to a superspace $\phi(X) \in \mathbb{R}^{D \times n}$ and then apply the PCA process to produce the features.
        We would also like to avoid the explicit computation of the high-dimensional $\phi(X)$, which can be done by bringing the kernel function $K(\cdot,\cdot)$ into the PCA process. Such a kernel PCA process for producing $Z$ is no longer linear, which allows us to find a non-linear solution for the classification task. This is very useful when solving a classification task on a linearly-inseparable dataset $X \in \mathbb{R}^{d \times n}$ with a low dimensionality, e.g., $d = 2$.
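One common way to realise this idea is sketched below. It assumes the standard formulation in which the kernel matrix is centred as $K_c = K - \mathbf{1}_n K - K \mathbf{1}_n + \mathbf{1}_n K \mathbf{1}_n$ (with $\mathbf{1}_n$ the $n \times n$ matrix whose entries are all $1/n$) and the features are read off from the eigendecomposition of $K_c$. The function name `kernel_pca` and the sanity check with a linear kernel are assumptions for illustration; this is not the derivation expected in the exercises.

```python
import numpy as np

def kernel_pca(K, k):
    """Return the k x n kernel-PCA features and top-k eigenvalues from an n x n kernel matrix."""
    n = K.shape[0]
    one_n = np.full((n, n), 1.0 / n)
    # Centre phi(X) implicitly, without ever forming phi(X).
    K_c = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    # K_c is symmetric, so eigh applies; it returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(K_c)
    order = np.argsort(eigvals)[::-1][:k]
    lam = np.clip(eigvals[order], 0.0, None)   # remove tiny negative round-off
    V = eigvecs[:, order]
    # If centred phi(X) = U S V^T, then K_c = V S^2 V^T and Z = U^T phi(X) = S V^T.
    Z = (V * np.sqrt(lam)).T
    return Z, lam

# Sanity check with a linear kernel, where kernel PCA should match linear PCA.
rng = np.random.default_rng(1)
X = rng.normal(size=(3, 20))
Z, lam = kernel_pca(X.T @ X, k=2)
print(Z.shape)
```

With the linear kernel $K = X^{T} X$, the rows of `Z` coincide with ordinary PCA features up to a sign flip per component, which is a convenient way to test an implementation before switching to a non-linear kernel.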
        Dataset and Task Summary
        The dataset for this coursework is the Circles Dataset, a synthetic dataset widely used to design and test models. The dataset contains 500 samples in two classes, i.e., $X \in \mathbb{R}^{2 \times 500}$. To load the dataset, please download the Circles.data file from Blackboard. The data file consists of three columns: the first two columns represent the two features of $X$, while the third column denotes the class labels, i.e., class 1 or class 2. Try plotting the dataset to see how the samples of the two classes are distributed.
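A minimal loading sketch is given below. The sheet does not state how the three columns are delimited, so the default whitespace delimiter of `numpy.loadtxt` is an assumption (pass `delimiter=","` for a comma-separated file). The helper name `load_circles` and the three in-memory sample rows are made up for illustration.

```python
import io
import numpy as np

def load_circles(source="Circles.data", delimiter=None):
    """Load a three-column data file into X (2 x n) and labels y (n,).

    `source` may be a relative path or an open file-like object.
    """
    data = np.loadtxt(source, delimiter=delimiter)
    X = data[:, :2].T            # samples as columns, matching X in R^{2 x n}
    y = data[:, 2].astype(int)   # class labels, 1 or 2
    return X, y

# Tiny in-memory stand-in for the real file (made-up values, illustration only).
sample = io.StringIO("0.1 0.2 1\n-0.5 0.4 2\n1.0 -1.0 1\n")
X_demo, y_demo = load_circles(sample)
print(X_demo.shape, y_demo)
```

With the real file in the same directory as the notebook, `X, y = load_circles()` followed by a scatter plot such as `matplotlib.pyplot.scatter(X[0], X[1], c=y)` shows how the two classes are arranged.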
        The task in this coursework is to use kernel PCA to transform the original dataset $X \in \mathbb{R}^{2 \times 500}$ into a linearly-separable dataset $Z \in \mathbb{R}^{k \times 500}$ with the minimum number of PCs, i.e., a minimum $k$ value. To confirm whether the dataset can be made linearly separable, we will use a very simple classification model, a decision stump. The whole process can be divided into the following steps:
        1. Choose a kernel function with appropriate hyperparameter value.
        2. Apply kernel PCA on the original set $X \in \mathbb{R}^{2 \times 500}$ to generate the transformed data $Z \in \mathbb{R}^{k \times 500}$.
        3. Find the minimum number of PCs, i.e., the minimum $k$ value required to classify all data points in $Z$ correctly, using only one decision stump.
        The tasks to complete are broken down into exercises, which are detailed in the following sections. When solving these tasks, make sure to keep the Circles.data file in the same directory as your code file.
        Exercises 1-3
        Exercise 1 (35 marks) :
        • Please use equations to mathematically prove how we can apply PCA on φ(X) without explicitly computing φ(X). (20 marks)
        • Please use equations to mathematically prove how to compute the transformed dataset Z, i.e., the projection, without linking to any computation of φ(X). (15 marks)
        Hint: recall how SVD works with φ(X), then link the SVD with the result of the kernel function, i.e., the kernel matrix K.
        Note: don’t forget the standardisation procedure before the PCA process.
        Important: full marks can be awarded for Exercise 2 and Exercise 3 only if the answers to Exercise 1 are correct. Otherwise, we will award only 50% of the total marks to any following tasks that rely on the theory in Exercise 1, because we regard your code or discussion in these tasks as built on incorrect theory, even though they may be correct within the scope of the task.
        Exercise 2 (30 marks) :
        Based on the theories from Exercise 1, choose the kernel (Homogeneous Polynomial or Gaussian) and the corresponding hyperparameters that can be used in conjunction with PCA to produce a linearly-separable dataset Z. Implement the kernel PCA, and answer several questions to justify your selection, as follows:
        • Provide the code snippet with results to show your correct implementation of kernel PCA. (15 marks)
        • What kind of projection can be achieved with the Homogeneous Polynomial kernel and with the Gaussian kernel? (5 marks)
        • What is the influence of the degree p in a Homogeneous Polynomial kernel? (5 marks)
        • How can one relate the Gaussian width σ to the data available? (5 marks)
        Note: don’t forget the standardisation procedure before the PCA process.
        Note: you can use cross-validation to select hyperparameters. However, make sure that the selected ones are the most appropriate for the whole dataset.
        Important: there are ready-to-use implementations of kernel PCA in Python. You must implement your own solution and must not use any such libraries; otherwise, 0 marks will be given to any related tasks. Your code from assignment 4 can be used as a starting point to complete this coursework. More specifically:
        Libraries that implement basic operations can be used in the coursework, for example:
        - mean, variance, centre data
        - plotting
        - matrix and vector multiplications, inverse, transpose
        - computation of distance, divergence, or accuracy
        - singular value decomposition
        Libraries that implement the main solution operations must not be used in the coursework:
        - the linear version of PCA
        - the non-linear version of PCA, i.e., kernel PCA
        Exercise 3 (30 marks) :
        After the kernel PCA implementation and hyperparameter reasoning from Exercise 2, the next step is to build one decision stump that correctly classifies all the samples in the transformed dataset $Z$. Please complete the following tasks:
        • Determine the minimum number of PCs required to classify all the samples in the dataset Z correctly, using one decision stump. (10 marks)
        • Please justify the metric used to fit the decision stump. (5 marks)
        • Provide the splitting rule and the accuracy of the decision stump. (5 marks)
        • Plot the visualization of the input data of the decision stump, i.e., the 1-D features. (5 marks)
        • For the transformed dataset $Z$, if the minimum number of PCs satisfies $k \le 3$, plot the visualization of the transformed dataset $Z$. Otherwise (if $k > 3$), simply state that the visualization cannot be provided and report your result of $k > 3$. (5 marks)
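For reference, a decision stump can be fitted by exhaustive search over a single feature, a threshold, and the orientation of the split. The sketch below uses classification accuracy as the fitting metric, which is an assumption for this example (the exercise asks you to choose and justify your own metric). The function name `fit_stump` and the toy data are also illustrative.

```python
import numpy as np

def fit_stump(Z, y):
    """Search feature, threshold and polarity for the stump with the highest accuracy.

    Z: (k, n) features; y: (n,) labels with exactly two distinct values.
    """
    classes = np.unique(y)
    best = {"accuracy": -1.0}
    for j in range(Z.shape[0]):
        values = np.unique(Z[j])   # sorted distinct values of feature j
        # Candidate thresholds: midpoints between consecutive distinct values.
        thresholds = (values[:-1] + values[1:]) / 2.0
        for t in thresholds:
            for low, high in ((classes[0], classes[1]), (classes[1], classes[0])):
                pred = np.where(Z[j] <= t, low, high)
                acc = float(np.mean(pred == y))
                if acc > best["accuracy"]:
                    best = {"feature": j, "threshold": float(t),
                            "low": int(low), "high": int(high), "accuracy": acc}
    return best

# Toy data (made up): feature 0 separates the classes, feature 1 does not.
Z_toy = np.array([[0.1, 0.2, 0.8, 0.9],
                  [5.0, 1.0, 4.0, 2.0]])
y_toy = np.array([1, 1, 2, 2])
stump = fit_stump(Z_toy, y_toy)
print(stump)
```

Applying `fit_stump` to the first $k$ rows of $Z$ for increasing $k$ and stopping at the first $k$ that reaches an accuracy of 1.0 is one way to search for the minimum number of PCs.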
        Extras (5 marks) :
        Your code (.ipynb jupyter file) should be clearly and logically structured, and any answers or discussions for the exercises should be well written and adequately proofread before submission. A total of 5 marks are awarded for the organization and explanation (comments) of your code, and for the organization and presentation of your answers or discussions in the report (.pdf file).
        Submission
        Your submission will include two files:
        1. A report file (.pdf) with all your answers or any discussions of all the tasks in Exercises 1-3.
        2. A jupyter notebook file (.ipynb file) with all your code and appropriate explanations to understand your code.
        Our marking process may help you structure your report and code:
        1. For each task in Exercises 1-3, we will look for answers in your report. Therefore, please answer all the tasks in your report. For any tasks that require code snippets, please also include them in your report, which can be done through screenshots.
        2. We will also run your jupyter notebook to check that your code produces results that align with the answers in your report. When checking your code for the last time, please remember to Restart Kernel and Clear Outputs of All Cells, as we will do the same when examining your code.
        3. Note that when running your code, we will place the Circles.data file in the same directory as your jupyter notebook file. Hence, please do the same when testing your code, and avoid using any absolute paths in your code.
        In the end, please compress the two files into a .zip file, and name the .zip file as: "[CW]-[Session Number]-[Student ID]-[Your name]"
        For instance, CW-01-2023141520000-Tom.zip