BIFS614 Data Structures and Algorithms
K-Means Clustering in R (using RStudio)
To complete this homework you will first have to download R and RStudio. Both are available at no charge from the following links:
R download: https://cran.cnr.berkeley.edu/
RStudio download (choose the free desktop version): https://www.rstudio.com/products/rstudio/download2/
Once you have installed R and RStudio, make sure that you have the corresponding R script (BIFS614 Homework 4.R) on your desktop. Right-click on it and open it in RStudio. The interface will open with the script shown in the upper left window.
If you are not familiar with R, the easiest way to execute the script is one line at a time. Place your cursor at the end of the first line of the script and press the “RUN” button at the top of the upper left window. Each press executes ONE LINE and moves to the next. The bottom left window (the Console) will display the output of each line, while the upper right window will display any data objects created. The bottom right window will display any visuals (graphs, etc.) created.
Step through the script one line at a time by pressing RUN. When you get to the line that loads the Bioconductor source, you should wait and be sure that the package is fully loaded before pressing RUN again. The next line, for biocLite, will also take some time to complete – just be patient.
You should also pay attention to the console output because it may ask you to update a package – if you are asked to update, you should go ahead and say yes. You’ll have to place your cursor in the console window and type the letter it wants, which is a “y” for yes (but without the quotes) or an “a” for all if more than one package needs to be updated.
Be sure to read the script – there are many comments and instructions in there as well.
QUESTIONS TO ANSWER:
1. When you performed hierarchical clustering with the defaults set to 8, what did you see? What does this mean? Include an image as well as a description. (25 pts).
2. Modify the script to perform the hierarchical clustering for a different number of clusters (your choice but values between 5 and 12 are probably the most useful). Include an image of the new clustering – compare it to the original settings. What does this tell you? (50 pts).
3. The Golub et al. (1999) paper describes this dataset. How does your clustering compare to the results that they found? (25 pts).
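If you are unsure what "changing the number of clusters" looks like in R, the sketch below may help. It is only an illustration under stated assumptions: it uses the built-in USArrests data rather than the Golub data, and it assumes the clustering is done with base R's hclust and cutree, which may differ from what the homework script actually calls. The variable names (k_default, k_alt, and so on) are made up for this example.

```r
# Toy stand-in for the homework data: the built-in USArrests data set,
# NOT the Golub expression data used by the course script.
data(USArrests)
scaled <- scale(USArrests)            # put all variables on a common scale
d <- dist(scaled)                     # Euclidean distances between rows
hc <- hclust(d, method = "complete")  # agglomerative hierarchical clustering

# Cut the same tree at two different cluster counts (hypothetical values).
k_default <- 8
k_alt <- 5
groups_default <- cutree(hc, k = k_default)
groups_alt <- cutree(hc, k = k_alt)

# Compare how many observations fall into each cluster.
print(table(groups_default))
print(table(groups_alt))

# Draw the dendrogram and outline the k_alt clusters on it.
plot(hc, labels = FALSE, main = "Complete-linkage dendrogram")
rect.hclust(hc, k = k_alt, border = "red")
```

Because both groupings come from cutting the same tree, a smaller number of clusters merges groups that were separate at the larger number, which is one thing to look for when you compare your two images.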