1. Introduction
The rapid development of short-video platforms has significantly altered online consumer behavior, particularly in the context of content-based e-commerce. In Vietnam, short video platforms have achieved a high penetration rate, reaching approximately 79 million active social media accounts by the end of 2025 (Kemp, 2025). Specifically, TikTok and YouTube Shorts account for 40.9 million and 62.1 million users respectively, highlighting the central role of short videos in consumers’ digital lives. Studies show that approximately 82% of users are influenced to make a purchase after being exposed to this content (Kantar Vietnam, 2023). Short videos can stimulate immediate emotions and significantly shorten the pre-purchase consideration time to as little as 15 seconds to 5 minutes of viewing (Luo et al., 2025). As a result, they are considered an important driver of impulse buying behavior in the e-commerce environment (Yan, 2019).
In this context, online impulse buying is becoming increasingly common. Impulse buying is defined as the act of purchasing goods without prior planning (Bellini & Aiolfi, 2019). Previous studies show that this behavior can account for approximately 40-80% of total purchasing decisions, depending on the product group (Aragoncillo & Orus, 2018), which underscores its significant role in consumer behavior. In Vietnam, several studies have examined impulse buying behavior in the context of e-commerce and livestreaming (Quyen, 2024), clarifying the role of internal, external, and flow experience factors within the S-O-R model framework (Phung, 2024). However, although short videos have been shown to influence purchasing decisions on platforms such as TikTok (Nguyen et al., 2025), explanations of how impulse buying behavior forms in short-video contexts remain fragmented: there is no clear account of how emotional and cognitive mechanisms interact and function together within a unified framework.
While prior studies have examined emotional and cognitive mechanisms simultaneously, they have not systematically explained how these mechanisms interact or differ in their roles within the context of short-video commerce. According to the S-O-R model, environmental stimuli do not influence behavior directly; rather, they operate through individuals' internal states (Hochreiter et al., 2023). Arousal and pleasure represent consumers’ immediate emotional responses to environmental stimuli, whereas trust reflects their cognitive evaluation of shopping objects. Despite their different natures, both emotional responses and cognitive evaluations function as internal organism states within the S-O-R framework, jointly mediating the relationship between external stimuli and behavioral outcomes. Although recent studies suggest that arousal, pleasure, and trust may simultaneously act as mediating mechanisms (Wen et al., 2025), their relative roles have not been clearly established. Therefore, integrating these mechanisms as parallel mediators provides a more comprehensive explanation of how short-video characteristics influence impulse buying behavior through both emotional responses and cognitive evaluations.
This study was conducted to analyze the influence of short video characteristics, including visual appeal, social influence, entertainment, and informativeness on impulse buying behavior, while elucidating the mediating roles of arousal, pleasure, and trust in the Vietnamese market context. This study contributes to the literature by clarifying the integrated roles of emotional and cognitive mechanisms, while also offering managerial implications for optimizing short-video content strategies.
2. Theoretical background and research hypotheses
According to Kauldhar (2024), short-form video is a digital content format with a short duration (typically 15 to 60 seconds), optimized for mobile devices and social media environments, characterized by fast pacing, high conciseness, the ability to stimulate immediate emotions, and a high level of interaction. Short-form video allows for the simultaneous integration of entertainment, information, and social interaction within a single viewing experience, thereby blurring the lines between content consumption and purchasing behavior in the online environment (Kauldhar, 2024). To attract attention, short videos are often designed with high information density, rapid scene changes, on-screen text, and a fast pace (Tess et al., 2024). These emotionally rich stimuli in the digital environment can maintain a high state of arousal, impair cognitive control, and promote impulse buying behavior (Shen & Khalifa, 2012; Verhagen & Van Dolen, 2011; Xin et al., 2025).
Impulse buying behavior means that consumers have no prior plan or intention to buy, and they immediately make a purchase based solely on sudden or momentary thoughts (Chan et al., 2016). The nature of impulse buying is that consumers form cognitive and emotional responses during the purchasing process (Chang et al., 2012). In the context of short video usage, consumers perceive that these short content pieces stimulate emotions and perceived value, thereby increasing their buying intention (Thao & Hai, 2024). When consumers have a positive experience, they develop positive emotions. Consequently, they may overestimate their economic capabilities and personal needs, thereby increasing the likelihood of engaging in impulse buying behavior (Liu et al., 2020).
Theoretical Framework
This study adopts the Stimulus-Organism-Response (S-O-R) framework as a theoretical foundation to explain how environmental stimuli are transformed into impulse buying behavior. The S-O-R model posits that external stimuli (Stimulus) influence individuals’ internal states (Organism), which in turn lead to behavioral responses (Response). However, the model does not specify the exact nature of the “Organism” component. In contrast, the Pleasure-Arousal-Dominance (PAD) model proposed by Mehrabian and Russell (1974) provides a clearer explanation of internal emotional states through three dimensions: pleasure, arousal, and dominance. In impulse buying, however, dominance is often less emphasized than pleasure and arousal, as it reflects a sense of control, which is not prominent in spontaneous and impulsive decisions (Iyer et al., 2019). Empirical evidence also suggests that the role of dominance is not significant in fast-paced online consumption environments (Lamis et al., 2022). Accordingly, prior studies have primarily focused on pleasure and arousal as key emotional states in explaining impulse buying behavior (Leng et al., 2024; Ngo et al., 2024).
In short-video contexts, emotional stimuli are triggered almost instantly when users are exposed to content, which not only generates emotional responses but also contributes to the formation of trust through emotional and social cues (Škripcová & Viteková, 2025). Recent studies further indicate that arousal, pleasure, and trust can simultaneously act as mediating mechanisms in explaining impulse buying behavior in content-based e-commerce environments (Wen et al., 2025). In such environments, where information is condensed and decisions are made rapidly, consumers face a high level of uncertainty. Trust, therefore, plays a critical role in reducing perceived risk and enabling emotional responses to translate into actual purchase behavior. Accordingly, trust should be considered a cognitive state that operates alongside emotional states within the Organism component. Based on these arguments, this study conceptualizes the Organism component as a combination of emotional states (arousal and pleasure) and a cognitive state (trust). Accordingly, these variables are positioned as parallel mediating mechanisms within the S-O-R framework to explain impulse buying behavior in short-video commerce.
Hypotheses and research model
The impact of visual appeal on arousal, pleasure, and trust
Visual appeal refers to viewers' perceptions of the attractiveness, pleasantness, and clarity of images, colors, and layout in short videos (Iswanto, 2025). Previous studies have demonstrated that visual appeal elements such as color, layout, and movement can trigger immediate psychological responses (Kathuria & Bakshi, 2024). Within the S-O-R theory framework, Liu (2019) also indicates that the content presented in short videos can lead consumers to form more positive perceptions of the product. Studies in the field of visual perception show that visual appeal increases arousal and pleasure. In the study by Ngo et al. (2024), visual appeal was also shown to impact consumers' arousal and pleasure when using video platforms. Highly salient images, especially those with emotional connotations, have the potential to strongly activate emotional arousal states, including arousal and pleasure, attracting initial attention and prolonging viewers' observation time (Humphrey et al., 2012; Niu et al., 2012). Supplementing this theory, Bui (2025) confirms that the visual appeal of the first few seconds of a TikTok video can evoke a strong level of emotional arousal in users. When aesthetic arousal is satisfied, users fall into a positive psychological state, significantly reducing rational control. Beyond emotional responses, visual appeal also influences consumer trust. Previously, Tarayra et al. (2022) emphasized that sensory marketing increases trust by creating positive and professional perceptions. In the online environment, Wijaya and Kuswoyo (2022) pointed out that visual appeal positively influences consumer trust, thereby affecting online purchasing decisions. Further supporting this argument, Winarti (2023) confirmed that beautiful, vivid images have a positive and statistically significant impact on customer trust on social media. Based on the above, the following hypotheses are proposed:
H1: Visual Appeal has a positive effect on consumer arousal.
H2: Visual Appeal has a positive effect on consumer pleasure.
H3: Visual Appeal has a positive effect on consumer trust.
The impact of social influence on arousal, pleasure, and trust
Social influence is defined as changes in an individual's thoughts, emotions, attitudes, or behaviors due to interaction with another individual or group (Liang et al., 2016; Monteserin & Amandi, 2015). When consumers are exposed to social cues such as endorsements, positive reviews, or the bandwagon effect, these stimuli can alter their emotional state along the dimensions of the PAD model, including arousal and pleasure. In the context of online shopping, social influence reflects how individuals' social identities within an online community (such as friends, acquaintances, followers, or consumer communities) shape their trust, evaluation processes, and purchasing behaviors (Liang et al., 2016). Research by Ngo et al. (2024) shows that as social influence increases, pleasure and arousal also increase significantly. Other studies likewise indicate that positive social signals often evoke strong emotional responses, including arousal and pleasure (Akram et al., 2018). Endorsements from friends and influencers also have a strong impact on purchasing decisions, as social presence and interactive exchanges on online platforms enhance engagement and encourage impulse buying (Zhang et al., 2022). According to Chung and Cho (2017), consumer trust is significantly influenced by the reliability of social information sources, especially when information is conveyed by influential individuals in online communities. Subsequent studies show that consistent and professional information from social influencers helps reinforce consumer trust (Chen & Yang, 2023). A report by Influencer Marketing Hub (2023) also notes that marketing campaigns involving influencers achieve higher average trust levels than campaigns without influencers, reflecting the role of social influence in shaping trust. Based on these foundations, this paper proposes the following hypotheses:
H4: Social influence has a positive effect on consumer arousal.
H5: Social influence has a positive effect on consumer pleasure.
H6: Social influence has a positive effect on consumer trust.
The impact of entertainment on arousal, pleasure, and trust
Entertainment reflects the degree to which content brings feelings of joy and pleasure to consumers during online shopping, and within the S-O-R theory framework, this factor is considered an external stimulus affecting their emotional responses (Ali et al., 2014). Tess et al. (2024) also showed that users tend to access short videos frequently primarily for entertainment purposes, reflecting the increasingly prominent role of short videos in meeting the demand for relaxation and quick content consumption. From the PAD theory perspective, entertaining and emotionally rich video content has the potential to simultaneously increase arousal and pleasure, thereby promoting impulse buying behavior (Hoyer et al., 2020). Adding to this argument, Phuoc Nguyen (2025) explains that short videos create high emotional arousal through the entertainment of their content. Xiao et al. (2026) also studied how the entertainment of livestreaming positively impacts consumer pleasure. Moreover, the entertainment factor can also disrupt initial shopping plans by altering emotional states, thereby stimulating impulse buying behavior (Zhang et al., 2019). Beyond emotional responses, the entertainment of short videos also contributes to consumer trust. Huang et al. (2022) indicate that entertaining video content can evoke pleasant emotions, thereby increasing willingness to interact and build trust. Orús et al. (2016) suggest that lively and entertaining content helps reduce consumers’ skepticism towards marketing messages, thereby reinforcing their initial trust in the brand. From the perspective of emotional mechanisms and interaction, subsequent studies show that presenting creative and entertaining videos can increase consumer engagement, thereby contributing to a deeper trust in the brand in the digital environment (Teixeira et al., 2011; Hollebeek & Macky, 2018). Based on these foundations, this paper proposes the following hypotheses:
H7: Entertainment has a positive effect on consumer arousal.
H8: Entertainment has a positive effect on consumer pleasure.
H9: Entertainment has a positive effect on consumer trust.
The impact of informativeness on arousal, pleasure, and trust
Pavlou and Fygenson (2006) define informativeness as consumers’ belief that the information provided enhances efficiency in the process of collecting and evaluating product information. In the online shopping environment, informativeness is considered a feature of the presentation content, i.e., an external environmental factor (Stimulus) that can guide consumers' information processing, thereby influencing emotional responses during online shopping (Deng & Gu, 2020). Short videos are considered advantageous in conveying product information compared to other media (Qin et al., 2024). When the content provides clear, comprehensive, and relevant information, consumers tend to increase their attention and interest in the product (Qin et al., 2024; Zhang et al., 2020). According to the mechanism of the S-O-R model, this content characteristic acts as a stimulus from the environment, affecting the individual's internal psychological state. The process of receiving and evaluating positive information changes the "Organism" component, and from the perspective of the PAD model, this change is expressed through arousal and pleasure in the online shopping experience (Khachatryan et al., 2018), thereby laying the foundation for subsequent behavioral responses.
Beyond emotional responses, informativeness also plays a crucial role in forming consumer trust. Han (2014) shows that platforms providing comprehensive and clear product information help reduce ambiguity and increase consumers' intrinsic trust. Studies in the context of online commerce and live commerce also confirm that informativeness positively influences consumer trust (Lee et al., 2007). These results highlight the crucial role of informativeness in building consumer trust, particularly in content-based e-commerce environments like short videos. Based on these foundations, this paper proposes the following hypotheses:
H10: Informativeness has a positive effect on consumer arousal.
H11: Informativeness has a positive effect on consumer pleasure.
H12: Informativeness has a positive effect on consumer trust.
The impact of arousal, pleasure, and trust on impulse buying behavior
Arousal reflects consumers' level of stimulation and emotional activation during the shopping process (Mehrabian & Russell, 1974). When integrated into the S-O-R framework, arousal is considered a state belonging to “Organism”, the individual’s internal response to environmental influences (Mehrabian & Russell, 1974). Under this mechanism, high levels of arousal can impair cognitive control, leading to impulsive behavioral responses (Response) (Ngo et al., 2025). A recent experiment by Ngo et al. (2024) also clearly demonstrated the mediating role of arousal (AR) and pleasure (PL) in significantly promoting impulse buying behavior. Numerous studies indicate that as arousal levels increase, consumers’ cognitive control tends to decrease, making them more prone to impulsive buying decisions (Beatty & Ferrell, 1998). In online shopping environments, especially those rich in emotional stimulation, consumers often react based on immediate emotional states rather than rational consideration, thereby increasing impulse buying behavior (Ning Shen & Khalifa, 2012; Serfas et al., 2014). Pleasure is also a core component of the PAD model, representing the positive and pleasant emotional state that consumers experience during shopping (Mehrabian & Russell, 1974). Combined with the S-O-R perspective, pleasure belongs to the “Organism” component, playing a role in transforming environmental influences into behavioral responses (Nagano et al., 2023). When consumers experience high levels of pleasure, the shopping experience becomes more appealing, thereby increasing the tendency to engage in impulse buying (Lee & Yi, 2008; Mishra et al., 2014). Previous studies have also shown that pleasure plays an important role in promoting impulse buying behavior, as consumers tend to make purchases to maintain and prolong this positive emotional state (Kacen & Lee, 2002; Ning Shen & Khalifa, 2012).
Trust is understood as a psychological state reflecting consumers' expectations that sellers or platforms will behave reliably (Morgan et al., 1996). In the online commerce environment, trust is an internal factor (Organism) that plays an important role in reducing risk perception and promoting faster decision-making, thereby facilitating impulse buying behaviors (Styvén et al., 2017; Moreno et al., 2022). Recent studies show that as consumers’ trust levels increase, they tend to be less hesitant and more susceptible to emotional and social stimuli during the shopping process, thereby increasing the likelihood of impulse buying behavior (Han, 2023; Bao & Yang, 2022). Based on these foundations, this paper proposes the following hypotheses:
H13: Arousal has a positive effect on consumers' impulse buying behavior.
H14: Pleasure has a positive effect on consumers' impulse buying behavior.
H15: Trust has a positive effect on consumers' impulse buying behavior.
Based on the above hypotheses, the research model is as follows:
3. Data and research methods
3.1. Research scales
The study uses a 5-point Likert scale, ranging from “1 - Strongly disagree” to “5 - Strongly agree”. The original scales were developed in English and translated into Vietnamese using a two-way translation process to ensure content equivalence. Some observed variables were adjusted to fit the context of short videos in online shopping without changing the meaning of the measured concepts. Specifically, the Social Influence Scale (SI) is derived from the research of Ngo et al. (2024), reflecting the degree to which community and social interactions influence consumers' purchasing decisions. The Visual Appeal (VI) scale is derived from Liu et al. (2020) to measure the attractiveness and aesthetic appeal of short video content. The Entertainment (EN) scale is derived from the study by Oh et al. (2007) to reflect the level of interest and ability to evoke positive emotions in viewers of short videos. The Informativeness Scale (IN) is adapted from Ducoffe (1996), measuring the completeness, clarity, and usefulness of product information conveyed through short videos.
In addition, the Arousal (AR) scale is derived from Ngo (2024) and assesses viewers' level of emotional arousal when watching short-video content. The Pleasure (PL) scale is adapted from Huang et al. (2017), reflecting consumers' feelings of joy and comfort during the shopping process via short videos. The Trust (TR) scale is adapted from Zhang et al. (2021), measuring consumers' level of trust in the information and products being introduced. Finally, the Impulse Buying Behavior (OIB) scale is derived from Mehrabian and Russell (1974), reflecting consumers' tendency toward spontaneous and unplanned shopping (Appendix 1).
3.2. Data collection method
The study used convenience sampling of consumers in Vietnam who had previously made purchases through short videos on social media platforms. Data were collected through in-person and online surveys (via Google Forms) from December 2025 to February 2026. Screening criteria restricted participation to individuals who frequently watched short videos and had shopping experience through this format, and collecting data through multiple channels (in-person and online) increased sample diversity. Screening questions about shopping experience were used to exclude unsuitable respondents, thereby limiting sampling bias.
According to the sample size formula by Joskow and Yamane (1965), the minimum sample size required is 385. A total of 401 responses were collected. After screening and removing 9 invalid responses, 392 valid responses were used for analysis, which exceeds the required minimum. The study population comprised consumers in Vietnam who had purchased goods through short videos and who habitually used short-video platforms to find product information. The sample focused on individuals who frequently interacted with short-video content on social networks (such as TikTok, Facebook Reels, and YouTube Shorts). Selecting this group ensures a close fit with the research objective of examining how short-video characteristics influence impulse buying behavior, which supports the practical relevance of the results.
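The inputs to the sample-size formula are not reported. As a hedged illustration only, the sketch below shows one common large-population calculation, n = z²p(1 − p)/e², that yields the stated minimum of 385. The function name and the values z = 1.96, p = 0.5, and e = 0.05 are assumptions, not details taken from the study.

```python
import math

def min_sample_size(z: float = 1.96, p: float = 0.5, e: float = 0.05) -> int:
    """Minimum sample size for estimating a proportion in a large population.

    z, p, and e are assumed values (95% confidence, maximum variability,
    5% margin of error); the study does not report them.
    """
    return math.ceil(z ** 2 * p * (1 - p) / e ** 2)

required = min_sample_size()
valid_responses = 392  # number of valid responses reported in the study
print(required, valid_responses >= required)
```

Under these assumed inputs the raw value is 384.16, which rounds up to 385, so the 392 valid responses clear the threshold.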
3.3. Data analysis method
The study used a quantitative approach based on Partial Least Squares Structural Equation Modeling (PLS-SEM), a multivariate technique for evaluating path models with latent constructs (Hair et al., 2019). This method is appropriate because the research model includes multiple latent variables and simultaneous relationships between the influencing factors and impulse buying behavior. The first step evaluates the measurement model to test the reliability of the research constructs (Hair et al., 2019) and their discriminant validity (Henseler & Sarstedt, 2013). The next step evaluates the structural model to test the relationships between the constructs and the research hypotheses. Finally, a mediation analysis tests the mediating effects of the Organism variables (arousal, pleasure, and trust) in the relationship between short-video characteristics and impulse buying behavior.
4. Results and discussion
4.1. Descriptive statistics of the study sample
Of the total 401 survey forms collected, 392 valid samples were included in the analysis. Descriptive statistics show that the study sample had a female respondent rate of 77.6%, with the majority belonging to the 26-35 age group (31.6%) and the 18-25 age group (27.8%). Regarding educational attainment, the group with vocational or secondary school diplomas had the highest proportion (36.7%), while the group with monthly incomes of 15–30 million VND dominated with 38.8%, reflecting the characteristics of the target group with high access to short videos and aligning with the study objectives.
4.2. Evaluation of the measurement model
The analysis results in Appendix 2 show that all Cronbach's alpha (CA) values exceed 0.7 and all composite reliability (CR) values exceed 0.7. Additionally, all AVE values are above 0.5. Following Hair et al. (2017), it can therefore be concluded that the measurement model achieves internal consistency reliability and convergent validity.
Table 1. Fornell-Larcker Index Values
Construct | AR | EN | IN | OIB | PL | SI | TR | VI |
AR | 0.804 | | | | | | | |
EN | 0.259 | 0.798 | | | | | | |
IN | 0.430 | 0.430 | 0.864 | | | | | |
OIB | 0.223 | 0.158 | 0.388 | 0.938 | | | | |
PL | 0.249 | 0.241 | 0.400 | 0.249 | 0.838 | | | |
SI | -0.010 | -0.535 | -0.128 | -0.030 | -0.058 | 0.830 | | |
TR | 0.319 | 0.250 | 0.423 | 0.267 | 0.292 | 0.006 | 0.841 | |
VI | 0.180 | -0.089 | 0.266 | 0.081 | 0.182 | -0.032 | 0.222 | 0.799 |
According to Fornell and Larcker (1981), discriminant validity is established when the square root of the AVE of each construct exceeds its correlations with other constructs. As shown in Table 1, all diagonal values (i.e., the square roots of AVE) are greater than the corresponding inter-construct correlations, indicating satisfactory discriminant validity.
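The Fornell-Larcker comparison can be checked mechanically. The sketch below is illustrative only: it uses the values reported in Table 1, while the function name and the choice to compare absolute correlations are assumptions made for this example.

```python
# Lower triangle of Table 1: diagonal entries are sqrt(AVE),
# off-diagonal entries are inter-construct correlations.
constructs = ["AR", "EN", "IN", "OIB", "PL", "SI", "TR", "VI"]
lower = [
    [0.804],
    [0.259, 0.798],
    [0.430, 0.430, 0.864],
    [0.223, 0.158, 0.388, 0.938],
    [0.249, 0.241, 0.400, 0.249, 0.838],
    [-0.010, -0.535, -0.128, -0.030, -0.058, 0.830],
    [0.319, 0.250, 0.423, 0.267, 0.292, 0.006, 0.841],
    [0.180, -0.089, 0.266, 0.081, 0.182, -0.032, 0.222, 0.799],
]

def fornell_larcker_violations(names, tri):
    """List construct pairs whose absolute correlation is not below
    the sqrt(AVE) of both constructs in the pair."""
    violations = []
    for i, row in enumerate(tri):
        for j in range(i):
            r = abs(row[j])  # assumption: compare magnitudes, ignoring sign
            if r >= tri[i][i] or r >= tri[j][j]:
                violations.append((names[i], names[j], row[j]))
    return violations

violations = fornell_larcker_violations(constructs, lower)
largest = max(abs(row[j]) for i, row in enumerate(lower) for j in range(i))
smallest_diagonal = min(row[-1] for row in lower)
print(violations, largest, smallest_diagonal)
```

The largest absolute correlation (0.535, between SI and EN) is below the smallest diagonal value (0.798), so the list of violations is empty.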
Table 2. Results of the discriminant validity test using the HTMT index
Construct | AR | EN | IN | OIB | PL | SI | TR | VI |
AR | | | | | | | | |
EN | 0.309 | | | | | | | |
IN | 0.519 | 0.519 | | | | | | |
OIB | 0.251 | 0.178 | 0.434 | | | | | |
PL | 0.298 | 0.287 | 0.474 | 0.272 | | | | |
SI | 0.038 | 0.693 | 0.164 | 0.038 | 0.087 | | | |
TR | 0.378 | 0.290 | 0.499 | 0.292 | 0.337 | 0.060 | | |
VI | 0.212 | 0.141 | 0.362 | 0.084 | 0.201 | 0.091 | 0.245 | |
Table 2 presents the results of the discriminant validity assessment using the Heterotrait–Monotrait (HTMT) ratio, as recommended by Hair et al. (2017). All HTMT values are below the conservative threshold of 0.85, with the highest value being 0.693 (between SI and EN), further confirming that the constructs are empirically distinct and do not exhibit problematic overlap.
4.3. Structural Model Evaluation
As presented in Table 3, the VIF values for all constructs range from 1.126 to 1.863, which are well below the threshold of 3 (Hair et al., 2017), indicating that multicollinearity is not a concern in this model.
Table 3. Multicollinearity Test (VIF)
Construct | AR | EN | IN | OIB | PL | SI | TR | VI |
AR | | | | 1.147 | | | | |
EN | 1.863 | | | | 1.863 | | 1.863 | |
IN | 1.431 | | | | 1.431 | | 1.431 | |
OIB | | | | | | | | |
PL | | | | 1.126 | | | | |
SI | 1.459 | | | | 1.459 | | 1.459 | |
TR | | | | 1.176 | | | | |
VI | 1.164 | | | | 1.164 | | 1.164 | |
The study used the bootstrapping technique with 1,000 resamples to determine the statistical significance of the path coefficients. The hypothesis testing results are summarized in Table 4. The findings show that 14 out of 15 direct hypotheses are supported, with all significant paths having t-values greater than 1.96 and p-values less than 0.05. Only the relationship between social influence (SI) and pleasure (PL) is not supported due to a lack of statistical significance. Overall, the majority of the proposed relationships in the research model are empirically confirmed.
Table 4. Hypothesis Testing
Hypothesis | Relationship | β | P-value | Result |
H1 | VI → AR | 0.115 | 0.029 | Accept |
H2 | VI → PL | 0.114 | 0.015 | Accept |
H3 | VI → TR | 0.165 | 0.000 | Accept |
H4 | SI → AR | 0.132 | 0.048 | Accept |
H5 | SI → PL | 0.070 | 0.246 | Not accepted |
H6 | SI → TR | 0.165 | 0.038 | Accept |
H7 | EN → AR | 0.206 | 0.002 | Accept |
H8 | EN → PL | 0.156 | 0.016 | Accept |
H9 | EN → TR | 0.222 | 0.000 | Accept |
H10 | IN → AR | 0.336 | 0.000 | Accept |
H11 | IN → PL | 0.313 | 0.000 | Accept |
H12 | IN → TR | 0.306 | 0.000 | Accept |
H13 | AR → OIB | 0.123 | 0.023 | Accept |
H14 | PL → OIB | 0.166 | 0.002 | Accept |
H15 | TR → OIB | 0.180 | 0.000 | Accept |
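The bootstrap logic behind the significance tests in Table 4 can be sketched in a simplified form. The example below is an assumption-laden illustration, not the study's procedure: PLS-SEM software re-estimates the whole model on each resample, whereas here a single least-squares slope stands in for one path coefficient, and the data are synthetic (generated with an arbitrary true slope of 0.3), not the survey data.

```python
import random
import statistics

def slope(xs, ys):
    """Ordinary least-squares slope of ys on xs."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def bootstrap_t(xs, ys, n_resamples=1000, seed=0):
    """Return the slope, its bootstrap standard error, and the t-value."""
    rng = random.Random(seed)
    n = len(xs)
    estimates = []
    for _ in range(n_resamples):
        # Resample cases with replacement and re-estimate the slope.
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(slope([xs[i] for i in idx], [ys[i] for i in idx]))
    se = statistics.stdev(estimates)
    beta = slope(xs, ys)
    return beta, se, beta / se

# Synthetic data (hypothetical, NOT the study's data): 392 cases.
data_rng = random.Random(42)
x = [data_rng.gauss(0, 1) for _ in range(392)]
y = [0.3 * xi + data_rng.gauss(0, 1) for xi in x]

beta, se, t = bootstrap_t(x, y)
print(round(beta, 3), round(se, 3), round(t, 2), abs(t) > 1.96)
```

A path is treated as significant at the 5% level when the absolute t-value, the estimate divided by its bootstrap standard error, exceeds 1.96.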
The results in Table 5 show the following R² values: TR (0.223), AR (0.215), PL (0.179), and OIB (0.117). According to Garson's (2016) standard, the explanatory power of the model falls in the weak-to-moderate range. Although the R² of impulse buying behavior (OIB) only reaches 11.7%, this result still has practical significance, confirming that factors from short videos are important drivers of consumers' impulse buying behavior in the online environment. The relatively low R² suggests that impulse buying behavior may also be influenced by external factors such as price sensitivity, financial constraints, or situational triggers, which were not included in this model.
Table 5. R² and adjusted R² results of the structural model
Construct | R Square | Adjusted R Square |
AR | 0.215 | 0.207 |
OIB | 0.117 | 0.110 |
PL | 0.179 | 0.171 |
TR | 0.223 | 0.215 |
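The adjusted values in Table 5 are consistent with the standard adjustment, adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1). The sketch below assumes n = 392 valid responses and takes the number of predictors k from the hypothesized model (four short-video characteristics for AR, PL, and TR; three mediators for OIB). The function and variable names are illustrative.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R-squared for n cases and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

n = 392  # valid responses reported in the study
# construct: (R-squared from Table 5, number of predictors in the model)
models = {"AR": (0.215, 4), "PL": (0.179, 4), "TR": (0.223, 4), "OIB": (0.117, 3)}

adjusted = {
    name: round(adjusted_r2(r2, n, k), 3) for name, (r2, k) in models.items()
}
print(adjusted)
```

Rounded to three decimals, this gives 0.207 (AR), 0.171 (PL), 0.215 (TR), and 0.110 (OIB), matching the adjusted column of Table 5.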
4.4. Mediating effect test
Following Hair et al. (2018), bootstrapping was used to test the indirect effects in the model. The results indicate that trust mediates the relationship between several short-video characteristics and impulse buying behavior. Specifically, EN (β = 0.040; p = 0.004), IN (β = 0.055; p = 0.008), and VI (β = 0.030; p = 0.008) all have indirect effects on OIB through trust. Additionally, pleasure mediates the relationship between IN and OIB (β = 0.052; p = 0.011). Conversely, the indirect effects through arousal and the remaining paths through pleasure and trust did not reach statistical significance (p > 0.05), indicating no mediating role in these relationships (Appendix 3).
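Each indirect effect reported above is the product of the two direct paths it passes through. As a minimal check, the sketch below multiplies the relevant coefficients from Table 4; the dictionary layout and function name are assumptions made for illustration.

```python
# Direct path coefficients taken from Table 4.
to_mediator = {
    ("EN", "TR"): 0.222,
    ("IN", "TR"): 0.306,
    ("VI", "TR"): 0.165,
    ("IN", "PL"): 0.313,
}
to_outcome = {"TR": 0.180, "PL": 0.166}  # mediator -> OIB

def indirect_effect(a: float, b: float) -> float:
    """Product-of-coefficients indirect effect: predictor -> mediator -> outcome."""
    return a * b

indirect = {
    f"{x} -> {m} -> OIB": round(indirect_effect(a, to_outcome[m]), 3)
    for (x, m), a in to_mediator.items()
}
print(indirect)
```

The products round to 0.040, 0.055, 0.030, and 0.052, which agree with the reported indirect effects. Significance still has to come from the bootstrap, since the product alone says nothing about sampling variability.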
4.5. Discussion
Through structural equation modeling, the study confirmed the relationship between short video characteristics and impulse buying behavior (OIB) based on the S-O-R theoretical framework. Results showed that 14 out of 15 hypotheses were supported by empirical data, demonstrating the strong influence of digital content on consumer psychology. This high level of support indicates that short-video stimuli are capable of simultaneously activating multiple internal states, rather than influencing behavior through a single pathway. The R² value for impulse buying behavior is relatively low, indicating that the explanatory power of the model is limited. This suggests that, beyond short-video characteristics, other factors such as price sensitivity, financial constraints, or individual differences may also significantly influence impulse buying behavior. As such, impulse buying should be understood as a complex phenomenon driven by diverse situational contexts and evolving consumer needs within a rapidly advancing technological environment.
Specifically, visual appeal (VI) positively affects arousal (H1: β = 0.115, p = 0.029), pleasure (H2: β = 0.114, p = 0.015), and trust (H3: β = 0.165, p = 0.000). Compared to prior studies (Ngo et al., 2024; Bui, 2025), this finding confirms that visual elements not only attract attention but also shape both emotional and cognitive responses. This may be because visually appealing content reduces initial resistance and creates a sense of familiarity, which in turn lowers cognitive barriers and facilitates heuristic-based trust formation in fast-scrolling short-video environments. Similarly, entertainment (EN) shows a positive impact on arousal (H7: β = 0.206, p = 0.002), pleasure (H8: β = 0.156, p = 0.016), and trust (H9: β = 0.222, p = 0.000), consistent with Hoyer et al. (2020) and Huang et al. (2022). Notably, its effect on arousal is stronger than visual appeal, indicating that entertainment plays a more active role in stimulating psychological excitement. This can be explained by the immersive nature of entertaining content, which keeps users engaged and lowers cognitive resistance to commercial messages.
Informativeness (IN) is identified as the most influential factor, affecting arousal (H10: β = 0.336), pleasure (H11: β = 0.313), and trust (H12: β = 0.306), all with p < 0.001. Its coefficients are larger than those of the other stimuli, indicating that consumers in short-video environments rely heavily on information to interpret content. This suggests that, in short-video contexts, consumers do not rely solely on affective cues but actively seek informational signals to reduce uncertainty, especially in environments characterized by rapid content consumption and limited attention. By contrast, social influence (SI) significantly affects arousal (H4) and trust (H6), but not pleasure (H5: β = 0.070, p = 0.246), which differs from Ngo et al. (2024). This finding challenges the common assumption that social signals are inherently associated with enjoyment, suggesting instead that social influence in digital environments may function more as a credibility heuristic than as a source of intrinsic pleasure.
Finally, the results for H13, H14, and H15 confirm that arousal (β = 0.123, p = 0.023), pleasure (β = 0.166, p = 0.002), and trust (β = 0.180, p < 0.001) all positively influence impulse buying behavior. Among these, trust has the largest coefficient, although the differences between the three paths are modest. This pattern is consistent with a two-stage interpretation of impulse buying, in which emotional responses initiate the purchase impulse while cognitive validation through trust determines whether the behavior is ultimately carried out. The result fits the Vietnamese context, where consumers actively engage with short-video content but remain cautious in online transactions. Trust therefore plays a central role in bridging emotional reactions and behavioral outcomes, reinforcing its importance within the S-O-R framework in digital commerce settings.
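To make the bridging role of trust more concrete, the following is a minimal sketch that multiplies the path coefficients reported above to approximate each stimulus → internal state → OIB indirect effect. It rests on assumptions not stated in the study: that the reported β values are standardized path coefficients, and that a simple product-of-coefficients estimate is adequate. The study does not report these indirect effects or their significance, which would require bootstrapped confidence intervals. The names `STIMULUS_TO_STATE`, `STATE_TO_OIB`, `indirect_effects`, and `total_indirect` are hypothetical and used only for illustration.

```python
# Path coefficients as reported in the discussion above (assumed standardized).
# SI is omitted because its coefficients for H4 and H6 are not given in this section.
STIMULUS_TO_STATE = {
    "VI": {"arousal": 0.115, "pleasure": 0.114, "trust": 0.165},
    "EN": {"arousal": 0.206, "pleasure": 0.156, "trust": 0.222},
    "IN": {"arousal": 0.336, "pleasure": 0.313, "trust": 0.306},
}
STATE_TO_OIB = {"arousal": 0.123, "pleasure": 0.166, "trust": 0.180}


def indirect_effects(stimulus):
    """Product-of-coefficients estimate for each stimulus -> state -> OIB path."""
    return {
        state: beta * STATE_TO_OIB[state]
        for state, beta in STIMULUS_TO_STATE[stimulus].items()
    }


def total_indirect(stimulus):
    """Sum of the three mediated paths for one stimulus."""
    return sum(indirect_effects(stimulus).values())


for name in STIMULUS_TO_STATE:
    paths = indirect_effects(name)
    strongest = max(paths, key=paths.get)
    print(f"{name}: total={total_indirect(name):.3f}, strongest path via {strongest}")
```

Under these assumptions, the summed indirect effect is largest for informativeness (about 0.148, compared with about 0.091 for entertainment and 0.063 for visual appeal), and the path through trust is the largest of the three for each stimulus. These figures are illustrative arithmetic on the reported coefficients, not results of the study.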
5. Conclusion and implications
This study clarified the mechanism through which short videos affect consumers' online impulse buying (OIB) behavior in Vietnam by combining the S-O-R model with the PAD emotional state model. The results show that stimuli from short videos have a significant impact on arousal and pleasure, two core components of the PAD model, while also influencing consumer trust. In particular, informativeness (IN) was identified as the most important antecedent of internal psychological states, indicating that consumers place particular weight on the practical usefulness of content alongside its entertainment value. The findings provide several important theoretical and managerial implications.
From a theoretical perspective, this study extends the S-O-R framework by integrating both emotional (arousal and pleasure) and cognitive (trust) mechanisms as parallel mediators in explaining impulse buying behavior in short-video commerce. The results highlight that impulse buying is not driven solely by emotional stimulation but also involves cognitive validation through trust, thereby offering a more comprehensive understanding of consumer decision-making processes in digital environments.
From a managerial perspective, the findings suggest that businesses should move beyond purely entertainment-driven content and prioritize informativeness as a key strategic element. Specifically, delivering clear, concise, and relevant product information within the first few seconds of a video may reduce consumer uncertainty and enhance both emotional engagement and trust formation. Furthermore, the results indicate that trust plays a critical role in converting emotional responses into actual purchase behavior. Businesses should therefore focus on enhancing content credibility through authentic product representation, user-generated content, and credible endorsements, rather than relying solely on exaggerated promotional techniques. Finally, while entertainment and visual appeal remain important for capturing attention, they are likely to be more effective when combined with strong informational and trust-building elements. An optimal short-video strategy should thus balance emotional engagement with cognitive assurance to stimulate impulse buying behavior.
Despite these contributions, this study has several limitations that should be acknowledged. First, the non-probability sampling method limits the generalizability of the findings, and the self-reported data may contain subjective or socially desirable answers. Second, the cross-sectional design limits the ability to establish causal relationships between variables, as data were collected at a single point in time. Third, the sample is skewed toward female respondents; this may limit generalizability, although it arguably reflects the primary driving force of short-video e-commerce in Vietnam. In addition, given the relatively low explanatory power of the current model, future studies may incorporate additional variables such as price sensitivity, financial constraints, or individual characteristics to better explain impulse buying behavior. Future research may also examine other video commerce formats, such as livestreams or longer-form content, to explore whether the roles of emotional and cognitive mechanisms vary across contexts.