So my question is - when does it become appropriate to transform the data?
If you believe Dr. Wheeler and Dr. Shewhart: NEVER. Keep the data as they are; raw data are easier to interpret, and you are less likely to inadvertently distort them.
Of course, it is hard to say NEVER. Even Dr. Wheeler has a technique for analyzing infrequent events where, rather than counting the number of events per time interval, you plot the quantity 365 / (days between events). By plotting this rate, you are actually doing something close to an exponential transformation, which would be appropriate for time-between-events data.
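A minimal sketch of that rare-event approach (the gap values here are hypothetical): instead of counting events per interval, convert each gap between events into an annualized rate and chart the rates.

```python
# Wheeler's rare-event rate: 365 / (days between events) gives an
# annualized event rate that can be plotted on an individuals chart.
days_between = [45, 60, 30, 90]          # hypothetical gaps between successive events, in days
rates = [365 / d for d in days_between]  # events per year
print([round(r, 2) for r in rates])
```

Long gaps (good news) become low rates, short gaps become high rates, so the chart reads in the usual direction.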
My current plan is to chart the data and determine if the process is stable.
Good, always an appropriate first step.
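That first step can be sketched as an XmR (individuals and moving range) chart check; the measurements below are hypothetical, and the 2.66 factor is the standard individuals-chart constant.

```python
# XmR stability check: natural process limits from the average moving range.
data = [9.8, 10.1, 10.0, 10.4, 9.9, 10.2, 9.7, 10.3, 10.0, 10.1]

mean_x = sum(data) / len(data)
mrs = [abs(a - b) for a, b in zip(data[1:], data[:-1])]  # successive moving ranges
mr_bar = sum(mrs) / len(mrs)

# Standard individuals-chart limits: mean +/- 2.66 * average moving range
ucl = mean_x + 2.66 * mr_bar
lcl = mean_x - 2.66 * mr_bar

out_of_control = [x for x in data if x > ucl or x < lcl]
print(f"limits: [{lcl:.3f}, {ucl:.3f}], signals: {out_of_control}")
```

No points outside the limits is only the first stability test; run rules (runs, trends) would also be checked on a real chart.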
If it is, then next I will need to generate Cp and Cpk values for my process to determine whether the process is capable?
I'm not a big fan of those values. One should be able to look at the control chart (especially the limits) and compare it to the specification and to customer feedback to see if improvement is needed. A Cp of 2 is NOT a "magic number"; Boeing goes to a Cp of 3 (nine sigma) on some critical airplane components.
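For reference, the textbook index formulas being discussed, with hypothetical specification limits and sigma; as noted above, these numbers should supplement reading the chart, not replace it.

```python
# Textbook capability indices (all inputs hypothetical).
usl, lsl = 10.5, 9.5        # specification limits
mean, sigma = 10.05, 0.10   # process mean and within-subgroup sigma estimate

cp = (usl - lsl) / (6 * sigma)                   # potential capability, ignores centering
cpk = min(usl - mean, mean - lsl) / (3 * sigma)  # actual capability, penalizes off-center
print(round(cp, 2), round(cpk, 2))
```

A Cp of 3 in the Boeing example means the specification spread is 18 sigma wide, i.e. the limits sit nine sigma on either side of a centered process.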
If my data are not normal, at this point do I run the Anderson-Darling (AD) test and determine whether my p-value is greater than 0.05, OR should I run all available transformations of my data, find the one with the highest p-value, and use that transformed data to calculate my capability index?
My opinion is that the gain would not be worth the pain.
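To make the "run all available transformations" idea concrete: tools typically sweep the Box-Cox power parameter lambda (lambda = 0 is the log transform) and score each candidate with a normality test. This pure-Python sketch, with hypothetical skewed data, shows only the transform itself; a real analysis would use something like scipy.stats.boxcox plus an AD test on each candidate.

```python
import math

data = [1.2, 1.5, 2.0, 2.8, 4.1, 6.3, 9.9]  # hypothetical right-skewed data

def boxcox_transform(x, lam):
    """Box-Cox transform of a single positive value."""
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1) / lam

# Sweep a few lambda values; a tool would pick the one whose transformed
# data scores best on a normality test.
for lam in (-1, 0, 0.5, 1):
    transformed = [boxcox_transform(x, lam) for x in data]
    print(lam, [round(t, 3) for t in transformed[:3]])
```

Which illustrates the "pain": every chart and capability number afterward lives on a transformed scale that the people running the process do not think in.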
Does it make sense to apply the Central Limit Theorem to make my data more normal for control purposes, or does this again distort the voice of the process?
I think in asking the question you already know the answer.

Yes, it would be my opinion that it distorts the voice of the process. Of course, you may then ask about xBar-R control charting. The use of the xBar is NOT to invoke the CLT, but to determine whether you have within-subgroup versus between-subgroup variation.
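A sketch of that xBar-R point, with hypothetical subgroups: the subgroup ranges estimate within-subgroup variation, and limits derived from the average range test whether the subgroup means wander more than that within-subgroup variation can explain.

```python
# xBar chart limits from R-bar (subgroup data are hypothetical).
subgroups = [
    [10.1, 9.9, 10.0, 10.2],
    [10.0, 10.3, 9.8, 10.1],
    [9.9, 10.0, 10.2, 10.1],
    [10.2, 10.1, 9.9, 10.0],
]

xbars = [sum(g) / len(g) for g in subgroups]   # subgroup means
ranges = [max(g) - min(g) for g in subgroups]  # within-subgroup variation
grand_mean = sum(xbars) / len(xbars)
r_bar = sum(ranges) / len(ranges)

A2 = 0.729  # standard xBar-chart factor for subgroup size n = 4
ucl_x = grand_mean + A2 * r_bar
lcl_x = grand_mean - A2 * r_bar
print(f"xBar limits: [{lcl_x:.3f}, {ucl_x:.3f}]")
```

A subgroup mean outside these limits signals between-subgroup variation beyond what the within-subgroup spread predicts, which is exactly the comparison the chart is built to make.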
By the way, my parents and brother live in the Cleveland area (Macedonia and Twinsburg, respectively), so I make it up your direction on occasion.