Module talk:zh/data/dial

Taz dialect

variety_data["Taz"] = { --extinct?
	group = "Mandarin",
	-- order = (not yet decided),
	chinese = "米哈伊洛夫卡(塔茲語)",
	english = "Mikhaylovka (Taz)",
	link = "Taz dialect",
	lat = 43.945306, --[[w:ru:Михайловка (Ольгинский район)]]
	long = 134.811167
}

I'm going to add a 方言點 (dialect point) for the Taz dialect spoken in Михайловка (Ольгинский район) (Mixajlovka (Olʹginskij rajon)) (not very sure). The name 塔茲語 seems unused outside the Chinese Wikipedia. According to [1] (PDF page 18), Taz people call this dialect da2zihua4. Maybe we can call it 韃子話? @Justinrleung, RcAlex36 what do you think? -- 14:41, 21 October 2020 (UTC)

@沈澄心: Is there any Chinese resource that uses this term for this dialect in particular? If not, I'm not sure if it's appropriate. — justin(r)leung (t...) | c=› } 15:57, 21 October 2020 (UTC)
@Justinrleung: I haven't seen any reliable Chinese resource that mentions this dialect. Perhaps there's no (general) Chinese name for it. -- 03:45, 22 October 2020 (UTC)
@沈澄心: Hmm, then let's stick with Wikipedia's name for now. — justin(r)leung (t...) | c=› } 03:50, 22 October 2020 (UTC)
why not just omit it 🤷 —Fish bowl (talk) 06:16, 10 April 2022 (UTC)

Guangzhou

@Justinrleung, RcAlex36 I was wondering if we should split Guangzhou into two separate entries, one for 東山口音 and one for 西關口音. What do you guys think? The dog2 (talk) 16:52, 1 November 2020 (UTC)

@The dog2: I think we've talked about this before, and I said we shouldn't because the distinction is no longer there (people don't speak with one or the other, but probably with a mix of features from both). Wikipedia says that the Xiguan dialect isn't really used anymore. Also, I don't think we have the resources to split the two dialects. — justin(r)leung (t...) | c=› } 17:09, 1 November 2020 (UTC)

Qiaogang, Beihai

@Justinrleung, RcAlex36 Should Beihai-QG be split into two entries, one for Cát Bà and one for Cô Tô (per 广西北海侨港镇吉婆岛粤方言词汇研究)? -- 05:59, 3 November 2020 (UTC)

@沈澄心: Yeah, I think we could. Let's keep Beihai-QG as Cô Tô since that's what we've been dealing with so far (I think) and add another one for Cát Bà, which we could call Beihai-QG-CB? — justin(r)leung (t...) | c=› } 06:06, 3 November 2020 (UTC)
@Justinrleung, RcAlex36, 沈澄心: Just one thing to ask though. Are there any speakers of these dialects (and Mong Cai Cantonese) left? I thought after the end of the Vietnam War, the ethnic Chinese in the North were mostly expelled, and within Vietnam, only the one in Saigon (Ho Chi Minh City) in the South remains in significant numbers. The dog2 (talk) 06:25, 3 November 2020 (UTC)
@The dog2: Qiaogang - yes, they're Vietnamese Chinese who went back to Guangxi. Mong Cai - they're mostly new-ish residents who come from different parts of Northern Vietnam, and there are around two to three hundred according to 越南芒街市粤方言词汇研究. — justin(r)leung (t...) | c=› } 06:34, 3 November 2020 (UTC)
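@Justinrleung: I think that's OK. 越南芒街市粤方言词汇研究 says "本文将越南芒街市钦廉片粤方言的词汇与祖籍地方言——广西防城港市防城镇钦廉片粤方言,以及广西北海市侨港镇钦廉片粤方言(=Cô Tô Cantonese; Cát Bà Cantonese is 广府片)的词汇分别做对比" (roughly: "This paper compares the vocabulary of the Qin-Lian Yue dialect of Mong Cai, Vietnam with that of its ancestral dialect, the Qin-Lian Yue dialect of Fangcheng Town, Fangchenggang, Guangxi, and with that of the Qin-Lian Yue dialect of Qiaogang Town, Beihai, Guangxi"). -- 11:44, 3 November 2020 (UTC)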

Southeast Asian Cantonese labels

@Justinrleung, RcAlex36 Just wondering, should we label the Southeast Asian varieties of Cantonese with either 廣府 or 四邑? After all, both varieties exist in Southeast Asia. For instance, Yangon Cantonese is mostly of the 四邑 variety, while Kuala Lumpur, Ho Chi Minh City and Singapore Cantonese are mostly of the 廣府 variety. The dog2 (talk) 01:39, 6 December 2020 (UTC)

@The dog2: Yes, I've gone ahead and added some labels. The Siyi ones are already labelled as Taishan. — justin(r)leung (t...) | c=› } 01:45, 6 December 2020 (UTC)

Classification of 平南平田村閩語

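@Justinrleung, RcAlex36 Is 平田村閩語 Puxian Min? 广西平南平田村闽语研究 (published in 2011) classifies it as Min Nan, but this thesis also says 据发音人介绍,平田村人的祖先来自福建莆田,从莆田迁出应有三四百年以上的历史了。中途经广东连滩,迁入广西平南后已繁衍了十四代人——发音人74岁,是家族入广西后的第十二代人。 (roughly: "According to the informant, the ancestors of the people of Pingtian Village came from Putian, Fujian, and they must have left Putian more than three or four hundred years ago. Passing through Liantan, Guangdong on the way, they settled in Pingnan, Guangxi, where fourteen generations have since been born; the informant, aged 74, belongs to the twelfth generation of the family since its arrival in Guangxi.") -- 14:54, 18 March 2021 (UTC)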

@沈澄心: Looks like Min Nan to me, but I'm not sure. RcAlex36 (talk) 15:41, 18 March 2021 (UTC)
@沈澄心: It looks more Min Nan than Puxian, I think, e.g. it doesn't seem to have the characteristic lenition /β/ that Puxian Min has. And it's probably safer to follow what the author said. — justin(r)leung (t...) | c=› } 20:11, 18 March 2021 (UTC)

Kuala Lumpur Hakka

@Justinrleung, RcAlex36, The dog2 Which variety of Hakka does "Kuala Lumpur-H" represent? -- 14:57, 12 April 2021 (UTC)

@沈澄心: Dabu. It's recorded in 马来西亚吉隆坡大埔客家话词汇研究. RcAlex36 (talk) 15:02, 12 April 2021 (UTC)

Deqing

@沈澄心, Justinrleung There are two Deqings in the code. Should we change the Wu one to Deqing-W? RcAlex36 (talk) 03:22, 14 April 2021 (UTC)

@RcAlex36: Yes, it should be Deqing-W. -- 07:52, 14 April 2021 (UTC)
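
As a side note on why the duplicate matters, here is a minimal Lua sketch. It assumes entries are keyed by name in the same `variety_data["..."] = { ... }` form as the Taz snippet earlier on this page; the `group` value "OtherGroup" is only a placeholder for whatever the non-Wu Deqing actually uses, not real data.

```lua
-- Sketch only: field values are placeholders, not the module's real data.
local variety_data = {}

-- Two varieties that share the romanised name "Deqing".
variety_data["Deqing"] = { group = "OtherGroup" } -- placeholder for the non-Wu entry
variety_data["Deqing"] = { group = "Wu" }         -- silently replaces the entry above

-- With a disambiguating suffix on the Wu entry, both survive.
local fixed = {}
fixed["Deqing"] = { group = "OtherGroup" }
fixed["Deqing-W"] = { group = "Wu" }
```

Lua raises no error on the repeated key, so the first entry is simply lost; renaming the Wu one to "Deqing-W" keeps both retrievable.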

Szeyap/Siyi Cantonese

@RcAlex36, Justinrleung Just wondering, shouldn't we put Taishan at the top of the list for the Szeyap variety of Cantonese? To my knowledge, the Taishan dialect is considered to be the prestige dialect of this branch of Cantonese.

I would personally be inclined to group the various Cantonese dialects by branch, in much the same way that we group the Min Nan dialects: under Min Nan, we group all the Hokkien dialects together, all the Teochew dialects together, and all the Hainanese dialects together. So for Cantonese, I'd be inclined to group them into 廣府片, 四邑片, 高陽片, 吳化片 and so on. The dog2 (talk) 16:35, 29 April 2021 (UTC)

@The dog2: We're going with the order in 珠江三角洲方言詞彙對照 because it's our main source on these dialects. Taishan may be considered the representative dialect outside of Mainland China, but I'm not sure about its status in China now. — justin(r)leung (t...) | c=› } 17:03, 29 April 2021 (UTC)

Shanghai

@沈澄心, I see in a recent edit that you've put a comment on Shanghai. I think we've kind of been using 上海方言詞典, 上海话大词典 and 上海市区方言志 (as well as other sources) for this, so it's not just 老派 for sure. 上海话大词典 and 上海市区方言志 are 中派 (previously called 新派). I don't think we should necessarily make new fields for different age groups (派). — justin(r)leung (t...) | c=› } 03:01, 9 June 2021 (UTC)

Dunhuang

@Justinrleung, RcAlex36, 沈澄心: Should this be split into two entries, one for 河東話 and one for 河西話 (per 中国语言地图集)? --TheDarkKnightLi(STAY HAPPY) 02:17, 30 August 2021 (UTC)

@Thedarkknightli: The only source I'm using for Dunhuang is 普通话基础方言基本词汇集, which records 河東話. — justin(r)leung (t...) | c=› } 02:27, 30 August 2021 (UTC)

Raoping & Puning Teochew

@Justinrleung: what sources are you using for these? -- 04:46, 21 January 2022 (UTC)

@沈澄心: Hi, for Raoping, it's 广东省饶平方言记音 (1993) and 饶平县志 (1994); I think the two sources are almost the same if not identical. For Puning, I'm using 普宁县志 (1995). We also have an editor from Puning, Austin Zhang. — justin(r)leung (t...) | c=› } 05:47, 21 January 2022 (UTC)

Taiwanese Hakka

@Justinrleung, RcAlex36 Hi! What are the exact locations of the six 方言點 in 臺灣客家語常用詞辭典? And which variety of Hakka does "Taoyuan" represent? -- 09:17, 31 January 2022 (UTC)

Some theses about Taiwanese Hakka:

  • 永定新舊移民之客家話比較——以楊梅鎮秀才窩與蘆竹鄉羊稠村為例:桃園市楊梅區秀才里、蘆竹區羊稠里(永定)
  • 台灣平遠客家話研究——以湖口鄉賴屋庄為例:新竹縣湖口鄉中勢村賴屋庄(平遠)
  • 高樹大路關與內埔客家話比較研究:屏東縣高樹鄉大路關廣福村、內埔鄉(四縣)
  • 臺灣海四話研究——以苗栗縣西湖鄉為例:苗栗縣西湖鄉(海四)
  • 新竹饒平客語詞彙研究:新竹縣竹北市六家區林屋、關西鎮鄭屋、芎林鄉紙寮窝劉屋、湖口鄉周屋(饒平)
  • 花蓮地區客語阿美語接觸研究:花蓮縣(阿美族)
  • 苗栗縣泰安鄉族語式客家話研究:苗栗縣泰安鄉(四縣 - 客家、泰雅族)
  • 桃園楊梅地區四海與海四客語語言接觸研究:桃園市楊梅區(四海、海四)
  • 兩岸饒邑王氏家族客家話的研究:桃園市新坡(饒平)、廣東饒平新豐鎮
  • 關西湖肚饒平客話研究:新竹縣關西鎮東山里湖肚周屋(饒平)
  • 關西饒平客家話調查研究-以鄭屋、許屋為例:新竹縣關西鎮鄭屋、許屋(饒平)
  • 新屋海陸客家話詞彙研究:桃園市新屋區(海陸)
  • 新屋鄉呂屋豐順腔客話研究:桃園市新屋區呂屋(豐順)
  • 台灣漳州客家話的研究——以詔安話為代表:雲林縣崙背鄉(詔安)
  • 台灣桃園饒平客話研究:桃園市中壢區芝芭里(饒平)
  • 屏東縣九如鄉玉泉村暨鹽埔鄉洛陽村之客家話研究
  • 台灣五華(長樂)客家話研究:桃園市中壢區仁美里、新屋區槺榔里、觀音區坑尾里、觀音區金湖里、楊梅區瑞原里、楊梅區永寧里、中壢區普義里,苗栗縣通霄鎮烏眉里(五華)
  • 苗栗卓蘭客家話研究:苗栗縣公館鄉(四縣)、卓蘭鎮(四縣、饒平)
  • 屏東高樹鄉大路關廣福村客家話研究
  • 屏東佳冬客話研究:屏東縣佳冬鄉
  • 國姓鄉1948年來臺之陸豐客話研究:南投縣國姓鄉(陸豐雲嶺、下砂,今屬揭西)
  • 兩岸大埔客家話研究:台中市東勢區(大埔)
  • 臺灣苗栗通霄客話研究:苗栗縣通霄鎮烏眉里、楓樹里、福龍里、城北里、南和里、福興里
  • 屏東地區閩客雙方言接觸現象——以保力、武洛及大埔為例:屏東縣里港鄉茄苳村武洛社區、南州鄉萬華村大埔社區(海四、四海)、車城鄉保力村閩南語
  • 臺灣桃園地區詔安客家話之研究:桃園市中壢區三座屋、大溪區南興
  • 桃園縣觀音鄉白玉村閩式客家話之研究
  • 苗栗海線客家話之語言混用現象研究:苗栗縣後龍鎮、通霄鎮、苑裡鎮(四縣、海陸)
  • 中寮鄉客家話的語言接觸現象:南投縣中寮鄉(四縣、海陸)
  • 台灣四縣海陸客家話比較研究:新竹縣芎林鄉上山村(海陸)、苗栗(四縣,引自《客話辭典》《客語字音詞典》)
  • 桃園官路缺袁姓饒平客家話研究:桃園市八德區官路缺袁氏(饒平)
  • 桃園高家豐順客話音韻研究:桃園市觀音區藍埔里高家(豐順)
  • 賽夏族客家話使用現況研究——以南庄鄉東河村為例:苗栗縣南庄鄉東河村(賽夏族)
  • 關西六曲窩海陸客話音韻研究
  • 嘉義縣方言志:嘉義縣中埔鄉(四海、海四)
  • 花蓮玉里四海客家話研究

-- 09:17, 31 January 2022 (UTC)

@沈澄心: It's not entirely clear to me which locations they are exactly, and I don't think they are actually specific. The new 新編客家語六腔辭典 tells us that it's these points:
  • 四縣腔:苗栗
  • 海陸腔:新竹竹東
  • 大埔腔:臺中東勢
  • 饒平腔:新竹芎林
  • 詔安腔:雲林崙背
  • 南四縣腔:屏東內埔
I think we can assume that this holds for 臺灣客家語常用詞辭典 as well. As for the Hakka Affairs sources (which have been down for a few months), I think they are about the same except that there may be coverage of some additional points, particularly for Raoping (probably Zhuolan and something else), but it doesn't seem clearly indicated.
As for Taoyuan, I really don't know what dialect exactly it is. The only source that I've been using for that parameter is 桃园话音档. — justin(r)leung (t...) | c=› } 09:50, 31 January 2022 (UTC)
@Justinrleung: Maybe "Taoyuan" (Taoyuan City, not non-Hakka speaking Taoyuan District) can be deprecated? It seems too ambiguous. 09:39, 3 February 2022 (UTC)
@沈澄心: Hmm, maybe, but we still have a lot of pronunciation data from 桃园话音档 in the dialectal pronunciation modules. — justin(r)leung (t...) | c=› } 16:49, 3 February 2022 (UTC)

Raoping-XF

@Justinrleung What's your source for Raoping (Xinfeng) Hakka? -- 05:19, 22 March 2022 (UTC)

@沈澄心: 饒平上饒客家話語言特點說略 (1992), 广东省饶平方言记音 (1993), 饶平县志 (1994), 广东饶平上饶客家话的两字连读变调 (2004). — justin(r)leung (t...) | c=› } 13:33, 22 March 2022 (UTC)

Changing Manjung (Gutian) to Sitiawan (Gutian)

There was some discussion about this on my talk page with @The dog2 and @Justinrleung last year, but it didn't reach any outcome.

If no one disagrees, I'd like to have "Manjung-MD-GT" changed to "Sitiawan-MD-GT". I'll make some changes in this module first and then request a bot at the Grease pit to search for all instances of "Manjung-MD-GT" in the module namespace and replace them with "Sitiawan-MD-GT". Long story short, I believe "Manjung" is a misnomer and "Sitiawan" would be more accurate. Lousysofa (formerly NameName233) (talk) 09:59, 11 April 2023 (UTC)

@Lousysofa: Sounds good. Perhaps @Fish bowl might be able to help with the bot request? — justin(r)leung (t...) | c=› } 21:43, 11 April 2023 (UTC)
  Done. Fish bowl (talk) 23:11, 12 April 2023 (UTC)
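
For anyone attempting a similar rename by hand, here is a rough Lua sketch of the search-and-replace step. It is not the bot that was actually run; `escape_pattern` and `rename_key` are hypothetical helper names. The one real pitfall it illustrates is that `-` is a magic character in Lua patterns, so a key like "Manjung-MD-GT" has to be escaped before being passed to `string.gsub`.

```lua
-- Hypothetical helpers, not the actual bot code.

-- Escape every punctuation character so the string matches literally.
local function escape_pattern(s)
	return (s:gsub("%p", "%%%0"))
end

-- Replace every quoted occurrence of the key `old` with `new` in `source`.
-- Returns the new text and the number of replacements made.
local function rename_key(source, old, new)
	local pattern = escape_pattern('"' .. old .. '"')
	-- "%" is the only special character in a gsub replacement string.
	local replacement = ('"' .. new .. '"'):gsub("%%", "%%%%")
	return source:gsub(pattern, replacement)
end

local text = 'variety_data["Manjung-MD-GT"] = {}'
local renamed, count = rename_key(text, "Manjung-MD-GT", "Sitiawan-MD-GT")
```

Matching the key together with its surrounding quotes avoids touching longer keys that merely contain the same substring.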

Changing the naming conventions

In Module:zh/data/dial/documentation#Naming conventions, it states that "All levels below the county level should be specified in brackets only when needed (i.e. when the location in question is not the county-/prefecture-level seat)."

However, there are occasions where omitting the town-level location will be confusing even when it is the county seat, due to PRC's policy of mucking up place names left, right and centre: county seats can be moved on a whim, and mergers of multiple counties are sometimes handled by placing the county seat in county A but naming the new county after county B. For example, Wuchuan (present-day county seat at 梅菉街道) is a merger of Wuchuan county 吳川縣 and Meimao (?) county 梅茂縣, whose county seats were at 吳陽鎮 and 梅菉街道 respectively, while their dialects are usually referred to as 吳川話 and 梅菉白話 respectively. I'm pretty certain there are similar situations out there, but the history of administrative divisions isn't really in my wheelhouse (yet).

Anyway, following the convention strictly (in the previous example, we would have 吳川話 as "Wuchuan (Wuyang)" and 梅菉白話 as "Wuchuan") will likely lead to confusion for both readers and editors, and may not be descriptive. I suggest that we change this line to "Levels below the county level should be specified in brackets (e.g. when the location in question is not the county-/prefecture-level seat).", and exercise our own editorial judgement when the location in question is the county seat.

@Justinrleung, 沈澄心 -- wpi (talk) 18:40, 26 October 2024 (UTC)

@wpi: Agreed. — justin(r)leung (t...) | c=› } 22:50, 26 October 2024 (UTC)
  Support dringsim 10:02, 27 October 2024 (UTC)

"Northeastern Mandarin" categorization

Should the "Northeastern Mandarin" variety be renamed or split to reflect all the locations it encompasses? Currently, the Beijing dialect, Taiwanese Mandarin, Malaysian Mandarin, Singaporean Mandarin, etc. are all categorized under "Northeastern Mandarin", which seems wrong.

At least based on Wikipedia, these are all sorted under Beijing Mandarin (北京官話) and not Northeastern Mandarin (東北官話). There does seem to be dispute as to whether these are separate groups, but when they are combined, the overall group seems to be referred to as 北京官話 (Beijing Mandarin) and not 東北官話 (Northeastern Mandarin) as it seems this module is doing.

The Wiktionary Regional Mandarin categories also split these dialects into Northeastern, Beijingic, and non-mainland Mandarin instead of combining them into one.

@Justinrleung, 沈澄心 -- Kalexchu (talk) 02:31, 28 October 2024 (UTC)

Also pinging @ND381, kc_kennylau. I don't have strong opinions, but I do agree that people would generally push back on our current situation. The main problem with splitting Beijing(ic) and Northeastern, if I'm not mistaken, is that Beijing(ic) is not really definable based on cladistic classification, so some of us have decided to merge Beijing(ic) and Northeastern. — justin(r)leung (t...) | c=› } 03:18, 28 October 2024 (UTC)